Home > Archive > SQL Server JDBC > July 2005 > Translated characters stored in database









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Translated characters stored in database
BirchBarlow

2005-05-23, 3:23 am

Is this a known issue? I am using the MS JDBC driver to connect to SQL
Server 2000 SP3a. The driver version is either SP2 or SP3. I tested both
with the same results. The database collation is Latin1. Sun java 1.3.1 or
1.4.2. When using the SendStringParameters
AsUnicode=false in the connection
string the driver translates the characters in the 0x80 to 0x9f range. The
translation does not occur when setting the above connection parameter to
true. I have also tested the Data Direct driver and there is no character
translation with the parameter true or false. I have enclosed a sample piece
of code to illustrate the issue. I need to use the parameter as false since
there is a significant performance hit with it set to true. Also my
application requires support for the full Latin1 (Windows 1252) character
set. It looks like the character translation is a result of dropping the
upper byte of the Unicode character which for all characters except 0x80 -
0x9f is 0x00. If you look at the Windows 1252 character set Unicode values
you will see that the 0x80 to 0x9f characters have a value in the upper byte
therefore simply truncating the upper byte produces a character translation.
Example: Character 0x80 has the Uniccode value of 0x20AC and is translated to
0xAC in the database.

table1:
varchar(300);

file: infile.txt is a binary file with the characters bytes 0x20 to 0xff

Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
con =
DriverManager.getConnection(" jdbc:microsoft:sqlse
rver:// myserver;SelectMetho
d=cursor;SendStringP
arametersAsUnicode=f
alse;user=test;passw
ord=test;DatabaseNam
e=TestDB"
);

BufferedReader r = new BufferedReader(new FileReader("infile.txt"));
String text = r.readLine();

PreparedStatement pstmt = null;
String sqlQuery = "insert into table1 (text) values (?)";

pstmt = con. prepareStatement(sql
Query);
pstmt.setString(1, text);
pstmt.execute();

pstmt.close();
pstmt=null;
con.close();
con=null;
Alin Sinpalean

2005-05-23, 7:23 am

Looks like a driver bug. You could try the jTDS driver (
http://jtds.sourceforge.net/ ), it will work just fine.

Disclaimer: I am a jTDS developer.

Alin.

Kamil Sykora [MSFT]

2005-05-24, 11:23 am

Hello,

I tested your code with this binary data in the infile.txt:

20 21 22 23 24 25 26 27 28 29 30 7B 7C 7D 7E 7F 80 82 83 84 85 86 87 88 89
8A C0 C1 C2 C3 C4 C5 F0 F1 F2 F3 FE FF

I then insert the data with the code below into the database. The result is
below. Note that the characters are probably not displaying correctly but
the hex values should:

select text, cast (text as varbinary(40)) from texttable

text


----------------------------------------------------------------------------
------------------------
----------------------------------------------------------------------------
------
!"#$%&'()0& #123;|}~€‚ƒ„…†‡ˆ‰ŠÀ
ÁÂÃÄÅðñòóþÿ

0x202122232425262728
29307B7C7D7E7F808283
8485868788898AC0C1C2
C3C4C5F0F1F2F3FE
FF


This is what I would expect. Are you seeing different results? If so, what
data are you seeing?

Thanks,
Kamil

Kamil Sykora
Microsoft Developer Support - Web Data

Please reply only to the newsgroups.
This posting is provided "AS IS" with no warranties, and confers no rights.


Are you secure? For information about the Strategic Technology Protection
Program and to order your FREE Security Tool Kit, please visit
http://www.microsoft.com/secur_ity.



--------------------
| From: "=?Utf-8?B?QmlyY2hCYXJsb3c=?="
< BirchBarlow@discussi
ons.microsoft.com>
| Subject: Translated characters stored in database
| Date: Sun, 22 May 2005 21:26:07 -0700
| Lines: 43
|
| Is this a known issue? I am using the MS JDBC driver to connect to SQL
| Server 2000 SP3a. The driver version is either SP2 or SP3. I tested
both
| with the same results. The database collation is Latin1. Sun java 1.3.1
or
| 1.4.2. When using the SendStringParameters
AsUnicode=false in the
connection
| string the driver translates the characters in the 0x80 to 0x9f range.
The
| translation does not occur when setting the above connection parameter to
| true. I have also tested the Data Direct driver and there is no
character
| translation with the parameter true or false. I have enclosed a sample
piece
| of code to illustrate the issue. I need to use the parameter as false
since
| there is a significant performance hit with it set to true. Also my
| application requires support for the full Latin1 (Windows 1252) character
| set. It looks like the character translation is a result of dropping the
| upper byte of the Unicode character which for all characters except 0x80
-
| 0x9f is 0x00. If you look at the Windows 1252 character set Unicode
values
| you will see that the 0x80 to 0x9f characters have a value in the upper
byte
| therefore simply truncating the upper byte produces a character
translation.
| Example: Character 0x80 has the Uniccode value of 0x20AC and is
translated to
| 0xAC in the database.
|
| table1:
| varchar(300);
|
| file: infile.txt is a binary file with the characters bytes 0x20 to 0xff
|
| Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
| con =
|
DriverManager.getConnection(" jdbc:microsoft:sqlse
rver:// myserver;SelectMetho

d=cursor;SendStringP
arametersAsUnicode=f
alse;user=test;passw
ord=test;Databas
eName=TestDB"
| );
|
| BufferedReader r = new BufferedReader(new FileReader("infile.txt"));
| String text = r.readLine();
|
| PreparedStatement pstmt = null;
| String sqlQuery = "insert into table1 (text) values (?)";
|
| pstmt = con. prepareStatement(sql
Query);
| pstmt.setString(1, text);
| pstmt.execute();
|
| pstmt.close();
| pstmt=null;
| con.close();
| con=null;
|

Kamil Sykora [MSFT]

2005-05-24, 8:23 pm

I was able to reproduce the problem. Sorry about the confusion.

I filed a bug on this problem.

Thanks,
Kamil

Kamil Sykora
Microsoft Developer Support - Web Data

Please reply only to the newsgroups.
This posting is provided "AS IS" with no warranties, and confers no rights.


Are you secure? For information about the Strategic Technology Protection
Program and to order your FREE Security Tool Kit, please visit
http://www.microsoft.com/secur_ity.



| From: kamils@online.microsoft.com (Kamil Sykora [MSFT])
| Organization: Microsoft Developer Support
| Date: Tue, 24 May 2005 16:12:13 GMT
| Subject: RE: Translated characters stored in database
|
| Hello,
|
| I tested your code with this binary data in the infile.txt:
|
| 20 21 22 23 24 25 26 27 28 29 30 7B 7C 7D 7E 7F 80 82 83 84 85 86 87 88
89
| 8A C0 C1 C2 C3 C4 C5 F0 F1 F2 F3 FE FF
|
| I then insert the data with the code below into the database. The result
is
| below. Note that the characters are probably not displaying correctly but
| the hex values should:
|
| select text, cast (text as varbinary(40)) from texttable
|
| text

|

|
|
----------------------------------------------------------------------------
| ------------------------
|
----------------------------------------------------------------------------
| ------
| !"#$%&'()0& #123;|}~€‚ƒ„…†‡ˆ‰ŠÀ
ÁÂÃÄÅðñòóþÿ

|
|
0x202122232425262728
29307B7C7D7E7F808283
8485868788898AC0C1C2
C3C4C5F0F1F2F3FE
| FF
|
|
| This is what I would expect. Are you seeing different results? If so,
what
| data are you seeing?
|
| Thanks,
| Kamil
|
| Kamil Sykora
| Microsoft Developer Support - Web Data
|
| Please reply only to the newsgroups.
| This posting is provided "AS IS" with no warranties, and confers no
rights.
|
|
| Are you secure? For information about the Strategic Technology
Protection
| Program and to order your FREE Security Tool Kit, please visit
| http://www.microsoft.com/secur_ity.
|
|
|
| --------------------
| | From: "=?Utf-8?B?QmlyY2hCYXJsb3c=?="
| < BirchBarlow@discussi
ons.microsoft.com>
| | Subject: Translated characters stored in database
| | Date: Sun, 22 May 2005 21:26:07 -0700
| | Lines: 43
| |
| | Is this a known issue? I am using the MS JDBC driver to connect to SQL
| | Server 2000 SP3a. The driver version is either SP2 or SP3. I tested
| both
| | with the same results. The database collation is Latin1. Sun java
1.3.1
| or
| | 1.4.2. When using the SendStringParameters
AsUnicode=false in the
| connection
| | string the driver translates the characters in the 0x80 to 0x9f range.
| The
| | translation does not occur when setting the above connection parameter
to
| | true. I have also tested the Data Direct driver and there is no
| character
| | translation with the parameter true or false. I have enclosed a sample
| piece
| | of code to illustrate the issue. I need to use the parameter as false
| since
| | there is a significant performance hit with it set to true. Also my
| | application requires support for the full Latin1 (Windows 1252)
character
| | set. It looks like the character translation is a result of dropping
the
| | upper byte of the Unicode character which for all characters except
0x80
| -
| | 0x9f is 0x00. If you look at the Windows 1252 character set Unicode
| values
| | you will see that the 0x80 to 0x9f characters have a value in the upper
| byte
| | therefore simply truncating the upper byte produces a character
| translation.
| | Example: Character 0x80 has the Uniccode value of 0x20AC and is
| translated to
| | 0xAC in the database.
| |
| | table1:
| | varchar(300);
| |
| | file: infile.txt is a binary file with the characters bytes 0x20 to 0xff
| |
| | Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
| | con =
| |
|
DriverManager.getConnection(" jdbc:microsoft:sqlse
rver:// myserver;SelectMetho

|
d=cursor;SendStringP
arametersAsUnicode=f
alse;user=test;passw
ord=test;Databas
| eName=TestDB"
| | );
| |
| | BufferedReader r = new BufferedReader(new FileReader("infile.txt"));
| | String text = r.readLine();
| |
| | PreparedStatement pstmt = null;
| | String sqlQuery = "insert into table1 (text) values (?)";
| |
| | pstmt = con. prepareStatement(sql
Query);
| | pstmt.setString(1, text);
| | pstmt.execute();
| |
| | pstmt.close();
| | pstmt=null;
| | con.close();
| | con=null;
| |
|
|

BirchBarlow

2005-07-08, 1:23 pm

Kamil & MSFT:

Thank you for the analysis and acknowledgement of the issue. I look forward
to a fix.




"Kamil Sykora [MSFT]" wrote:

> I was able to reproduce the problem. Sorry about the confusion.
>
> I filed a bug on this problem.
>
> Thanks,
> Kamil
>
> Kamil Sykora
> Microsoft Developer Support - Web Data
>
> Please reply only to the newsgroups.
> This posting is provided "AS IS" with no warranties, and confers no rights.
>
>
> Are you secure? For information about the Strategic Technology Protection
> Program and to order your FREE Security Tool Kit, please visit
> http://www.microsoft.com/securÂ_ity.
>
>
>
> | From: kamils@online.microsoft.com (Kamil Sykora [MSFT])
> | Organization: Microsoft Developer Support
> | Date: Tue, 24 May 2005 16:12:13 GMT
> | Subject: RE: Translated characters stored in database
> |
> | Hello,
> |
> | I tested your code with this binary data in the infile.txt:
> |
> | 20 21 22 23 24 25 26 27 28 29 30 7B 7C 7D 7E 7F 80 82 83 84 85 86 87 88
> 89
> | 8A C0 C1 C2 C3 C4 C5 F0 F1 F2 F3 FE FF
> |
> | I then insert the data with the code below into the database. The result
> is
> | below. Note that the characters are probably not displaying correctly but
> | the hex values should:
> |
> | select text, cast (text as varbinary(40)) from texttable
> |
> | text
>
> |
>
> |
> |
> ----------------------------------------------------------------------------
> | ------------------------
> |
> ----------------------------------------------------------------------------
> | ------
> | !"#$%&'()0& #123;|}~€‚ƒ„
…â€_‡ˆ‰Å_ÀÃ
ÂÃÄÅðñòóþÿ

>
> |
> |
> 0x202122232425262728
29307B7C7D7E7F808283
8485868788898AC0C1C2
C3C4C5F0F1F2F3FE
> | FF
> |
> |
> | This is what I would expect. Are you seeing different results? If so,
> what
> | data are you seeing?
> |
> | Thanks,
> | Kamil
> |
> | Kamil Sykora
> | Microsoft Developer Support - Web Data
> |
> | Please reply only to the newsgroups.
> | This posting is provided "AS IS" with no warranties, and confers no
> rights.
> |
> |
> | Are you secure? For information about the Strategic Technology
> Protection
> | Program and to order your FREE Security Tool Kit, please visit
> | http://www.microsoft.com/securÂ_ity.
> |
> |
> |
> | --------------------
> | | From: "=?Utf-8?B?QmlyY2hCYXJsb3c=?="
> | < BirchBarlow@discussi
ons.microsoft.com>
> | | Subject: Translated characters stored in database
> | | Date: Sun, 22 May 2005 21:26:07 -0700
> | | Lines: 43
> | |
> | | Is this a known issue? I am using the MS JDBC driver to connect to SQL
> | | Server 2000 SP3a. The driver version is either SP2 or SP3. I tested
> | both
> | | with the same results. The database collation is Latin1. Sun java
> 1.3.1
> | or
> | | 1.4.2. When using the SendStringParameters
AsUnicode=false in the
> | connection
> | | string the driver translates the characters in the 0x80 to 0x9f range.
> | The
> | | translation does not occur when setting the above connection parameter
> to
> | | true. I have also tested the Data Direct driver and there is no
> | character
> | | translation with the parameter true or false. I have enclosed a sample
> | piece
> | | of code to illustrate the issue. I need to use the parameter as false
> | since
> | | there is a significant performance hit with it set to true. Also my
> | | application requires support for the full Latin1 (Windows 1252)
> character
> | | set. It looks like the character translation is a result of dropping
> the
> | | upper byte of the Unicode character which for all characters except
> 0x80
> | -
> | | 0x9f is 0x00. If you look at the Windows 1252 character set Unicode
> | values
> | | you will see that the 0x80 to 0x9f characters have a value in the upper
> | byte
> | | therefore simply truncating the upper byte produces a character
> | translation.
> | | Example: Character 0x80 has the Uniccode value of 0x20AC and is
> | translated to
> | | 0xAC in the database.
> | |
> | | table1:
> | | varchar(300);
> | |
> | | file: infile.txt is a binary file with the characters bytes 0x20 to 0xff
> | |
> | | Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
> | | con =
> | |
> |
> DriverManager.getConnection(" jdbc:microsoft:sqlse
rver:// myserver;SelectMetho

> |
> d=cursor;SendStringP
arametersAsUnicode=f
alse;user=test;passw
ord=test;Databas
> | eName=TestDB"
> | | );
> | |
> | | BufferedReader r = new BufferedReader(new FileReader("infile.txt"));
> | | String text = r.readLine();
> | |
> | | PreparedStatement pstmt = null;
> | | String sqlQuery = "insert into table1 (text) values (?)";
> | |
> | | pstmt = con. prepareStatement(sql
Query);
> | | pstmt.setString(1, text);
> | | pstmt.execute();
> | |
> | | pstmt.close();
> | | pstmt=null;
> | | con.close();
> | | con=null;
> | |
> |
> |
>
>

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com