PROBLEMS IN CONVERTING DATABASE FROM US7ASCII TO UTF8 [message #53180] Sun, 01 September 2002 02:16
Khatri
Messages: 1
Registered: September 2002
Junior Member
CONVERTING DATABASE FROM US7ASCII TO UTF8
------------------------------------------

I am currently converting my database character set from the US7ASCII
character set to a UTF8 character set. The client applications are all Java
( which by default uses UCS2 ) via the JDBC 1.2 thin drivers.

I changed the database character set to UTF8 and also changed the
national character set to UTF8. This was done with the
ALTER DATABASE CHARACTER SET UTF8 command.

I started getting the following exception:
"java.sql.SQLException: Fail to convert between UTF8 and UCS2: failUTF8Conv"

Narrowing down the problem, I observed that many rows had columns containing
characters beyond the ASCII set ( like an e with an acute accent ).
These were plain VARCHAR2 columns.

I looked at the production database ( which is in US7ASCII ) to see if it had been like
that before, and it was showing fine there.

Of course, the underlying problem is that non-ASCII characters got inserted
into the database in the first place, while the character set was still
US7ASCII. The JDBC driver then receives the bytes ( 201 ) and ( 86 ) and tries
to read them as a single character, failing because together they are not a
valid UTF8 character(!?).
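
The failure described above can be reproduced outside the database. The following is a minimal Python sketch, not the JDBC driver's actual logic: it shows that the byte pair 201, 86 is rejected by a strict UTF-8 decoder, because 201 (0xC9) announces a two-byte sequence and 86 (0x56) is not a valid continuation byte. The helper name `is_valid_utf8` is hypothetical, and reading the bytes as ISO-8859-1 is only an assumption about how the data was originally entered.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly under strict UTF-8 rules."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The two byte values quoted in the post.
raw = bytes([201, 86])

# 201 is 0xC9, a UTF-8 lead byte that requires a continuation byte
# in the range 0x80-0xBF; 86 (0x56, the letter V) is not one.
print(is_valid_utf8(raw))

# Assumption: the bytes were originally typed in a single-byte
# Latin-1 style character set, where each byte is one character.
text = raw.decode("iso-8859-1")

# Re-encoding the decoded text yields bytes a UTF-8 reader accepts.
converted = text.encode("utf-8")
print(is_valid_utf8(converted))
```

The point of the sketch is that relabelling the database as UTF8 leaves the stored bytes unchanged, so any byte above 127 has to be properly re-encoded before a UTF-8 reader can interpret it.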

So my questions are:

1) Does anybody know of any way to look for non-ASCII characters for a
specified set of tables .... and then convert them into the proper UTF8
encoding??

I then tried exporting that table from the US7ASCII database with
NLS_LANG=AMERICAN_AMERICA.UTF8 and NLS_CHAR=UTF8 set, and then importing that
data into the UTF8 database, again with NLS_LANG=AMERICAN_AMERICA.UTF8 and
NLS_CHAR=UTF8 set during the import.

That non-ASCII character, the e with an acute, became the letter h, and many other characters changed in a similar way.

So my alternative of exporting the US7ASCII data as UTF8 did not work either.

I also tried exporting the US7ASCII data as US7ASCII and then importing it as UTF8
(with NLS_LANG=AMERICAN.AMERICA.UTF8) but it does not allow me to ( I got some IMP error ).

Any other ideas how to properly convert the database to UTF8 from US7ASCII character
set without getting the data corrupted?
Re: PROBLEMS IN CONVERTING DATABASE FROM US7ASCII TO UTF8 [message #54610 is a reply to message #53180] Fri, 22 November 2002 17:32
Trifon Anguelov
Messages: 514
Registered: June 2002
Senior Member
"I also tried exporting the US7ASCII data as US7ASCII and then importing it as UTF8
(with NLS_LANG=AMERICAN.AMERICA.UTF8) but it does not allow me to ( I got some IMP error )."

You are getting this error because of the wrong character set setup on import. See this note on how to do it.

"1) Does anybody know of any way to look for non-ASCII characters for a
specified set of tables .... and then convert them into the proper UTF8
encoding??"

You can always use the Character Set Scanner to do that before the import.
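
As a rough illustration of the kind of check involved (this is not how the Character Set Scanner itself is implemented), the Python sketch below flags values containing bytes outside the 7-bit ASCII range and re-encodes them as UTF-8. The function names, the sample rows, and the choice of ISO-8859-1 as the real source encoding are all assumptions made for the example; the actual encoding of the stored bytes has to be established first.

```python
from typing import Dict, List, Tuple


def find_non_ascii(rows: Dict[int, bytes]) -> List[Tuple[int, List[int]]]:
    """Return (row id, offsets) for every value holding bytes above 127."""
    hits = []
    for row_id, value in rows.items():
        offsets = [i for i, b in enumerate(value) if b > 127]
        if offsets:
            hits.append((row_id, offsets))
    return hits


def to_utf8(value: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode value as UTF-8, assuming it is really in source_encoding."""
    return value.decode(source_encoding).encode("utf-8")


# Hypothetical sample data standing in for raw VARCHAR2 column bytes.
rows = {
    1: b"plain ascii",
    2: b"caf\xe9",  # 0xE9 is e-acute in ISO-8859-1
}

flagged = find_non_ascii(rows)
fixed = {row_id: to_utf8(rows[row_id]) for row_id, _ in flagged}
print(flagged)
print(fixed)
```

Pure ASCII values pass through `to_utf8` unchanged, so only the flagged rows need attention.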

Hope that helps,

clio_usa
OCP - DBA
