|
Home > Archive > SQL Anywhere ultralite > January 2006 > utf8 collation on asa9 ultralite database - i need some help
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
utf8 collation on asa9 ultralite database - i need some help
|
|
| Michal Seliga 2006-01-16, 1:23 pm |
| i use sybase asa 9.0.2.3221, metrowerks codewarrior 9.3, ultralite static c++ api
since our customers use different charset we use utf8 collation for our test
asa9 project (in asa7 we had more databases with different collations)
but i have some problems with this utf8 database, could someone help me or
provide some background about what is possible and what isn't?
my problems are mainly based on string comparison. when string contains
non-ascii character then it doesn't work.
example
in database i have table customer with field cus_city
cus_city contains value 'HORNÝ SMOKOVEC' in one row
my database is set to be case insensitive so when i launch
select cus_city
from customer
where cus_city like 'horn%'
then it is correctly found
but when i give
select cus_city
from customer
where cus_city like 'horný%'
then it doesn't. does that mean that case (in)sensitivity doesn't work for
extended characters?
when i use it from c++ i convert all text values from utf-8 to locale codepage
and back (using iconv library) so all Gets and Sets from and to database give
correct results, problem is only when such string is used in any comparison (i
convert all comparison parameters to utf8 too)
so, did anyone here tried to use utf8 for ultralite database? if yes and you can
share experience, you may also write mail to my email....
just as background, our customers so far use cyrillic, latin and in future there
may be also arabic charset, but they are never mixed together. what we want to
achieve is one (utf8) database (and palm application) which can be filled with
any data and still work correctly independant on country (and charset) it is
used in. is it possible to use it this way? did someone tried it? were results
acceptable?
thanks in advance
| |
| Michael Thode 2006-01-17, 1:23 pm |
| In UltraLite only single byte characters support case insensitive
comparisons. With the utf8 collation the only characters that have
toupper() and tolower() support are ASCII values (<128).
The result in this case is that you are restricted to only exact string
matches when you have non english characters.
The only work around I can think of for you is to create your own sortkeys.
For example, if you had a name column in a table, you'd add a second column
called name_sortkey. When inserting into name_sortkey you'd convert the
name to upper case first. When I say convert to uppercase I mean use an OS
(or other library) function and not the SQL toupper() function. You can then
do comparisons on the name_sortkey field. You'd also have to convert your
parameters to uppercase as well (eg. like 'HORN%'). When fetching and
displaying the name you'd use the regular name column.
What was your motivation from moving away from the multiple collations you
used with asa7?
Hope that helps,
Mike
"Michal Seliga" <michal.seliga@visicom.sk> wrote in message
news:43cbdfb1@forums
-2-dub...
> i use sybase asa 9.0.2.3221, metrowerks codewarrior 9.3, ultralite static
c++ api
>
> since our customers use different charset we use utf8 collation for our
test
> asa9 project (in asa7 we had more databases with different collations)
>
> but i have some problems with this utf8 database, could someone help me or
> provide some background about what is possible and what isn't?
>
> my problems are mainly based on string comparison. when string contains
> non-ascii character then it doesn't work.
>
> example
>
> in database i have table customer with field cus_city
>
> cus_city contains value 'HORN SMOKOVEC' in one row
>
> my database is set to be case insensitive so when i launch
> select cus_city
> from customer
> where cus_city like 'horn%'
>
> then it is correctly found
>
> but when i give
> select cus_city
> from customer
> where cus_city like 'horn%'
>
> then it doesn't. does that mean that case (in)sensitivity doesn't work for
> extended characters?
>
> when i use it from c++ i convert all text values from utf-8 to locale
codepage
> and back (using iconv library) so all Gets and Sets from and to database
give
> correct results, problem is only when such string is used in any
comparison (i
> convert all comparison parameters to utf8 too)
>
> so, did anyone here tried to use utf8 for ultralite database? if yes and
you can
> share experience, you may also write mail to my email....
>
> just as background, our customers so far use cyrillic, latin and in future
there
> may be also arabic charset, but they are never mixed together. what we
want to
> achieve is one (utf8) database (and palm application) which can be filled
with
> any data and still work correctly independant on country (and charset) it
is
> used in. is it possible to use it this way? did someone tried it? were
results
> acceptable?
>
>
> thanks in advance
| |
| Michal Seliga 2006-01-18, 7:23 am |
| my application is divided to database part library and ui part which are build
separately. in asa7 i had multiple ui versions because of localisations and
multiple dblib versions because of collations. that meant i had to have project
in more databases, each built with different collation and complete building of
project took almost half of hour.
i hoped that with asa9 and utf8 support i will be able to have only one database
library which would speed building time and it will also help for customers who
use more than one language (multinational). but with these limitations it is
completely unusable, because our ultralite database consists of 70 tables and
some of them have 40 columns and your idea would make it too big and difficult
to use. but such existing utf8 support is IMHO completely unusable :-(
Michael Thode wrote:
> In UltraLite only single byte characters support case insensitive
> comparisons. With the utf8 collation the only characters that have
> toupper() and tolower() support are ASCII values (<128).
>
> The result in this case is that you are restricted to only exact string
> matches when you have non english characters.
>
> The only work around I can think of for you is to create your own sortkeys.
> For example, if you had a name column in a table, you'd add a second column
> called name_sortkey. When inserting into name_sortkey you'd convert the
> name to upper case first. When I say convert to uppercase I mean use an OS
> (or other library) function and not the SQL toupper() function. You can then
> do comparisons on the name_sortkey field. You'd also have to convert your
> parameters to uppercase as well (eg. like 'HORNÝ%'). When fetching and
> displaying the name you'd use the regular name column.
>
> What was your motivation from moving away from the multiple collations you
> used with asa7?
>
> Hope that helps,
> Mike
>
> "Michal Seliga" <michal.seliga@visicom.sk> wrote in message
> news:43cbdfb1@forums
-2-dub...
> c++ api
> test
> codepage
> give
> comparison (i
> you can
> there
> want to
> with
> is
> results
>
>
| |
| Michael Thode 2006-01-18, 11:23 am |
| Unfortunately the functionality your requesting, would add significant
overhead to the the size of UltraLite. Unicode collation tables are very
large due the number of characters defined. SQL Anywhere will have the
functionality your requesting in our next major release, but UltraLite will
not.
Mike
"Michal Seliga" <michal.seliga@visicom.sk> wrote in message
news:43ce1126$1@foru
ms-2-dub...
> my application is divided to database part library and ui part which are
build
> separately. in asa7 i had multiple ui versions because of localisations
and
> multiple dblib versions because of collations. that meant i had to have
project
> in more databases, each built with different collation and complete
building of
> project took almost half of hour.
>
> i hoped that with asa9 and utf8 support i will be able to have only one
database
> library which would speed building time and it will also help for
customers who
> use more than one language (multinational). but with these limitations it
is
> completely unusable, because our ultralite database consists of 70 tables
and
> some of them have 40 columns and your idea would make it too big and
difficult[color=dark
red]
> to use. but such existing utf8 support is IMHO completely unusable :-(
>
> Michael Thode wrote:
sortkeys.[color=darkred]
column[color=darkred
]
OS[color=darkred]
then[color=darkred]
your[color=darkred]
you[color=darkred]
static[color=darkred
]
or[color=darkred]
for[color=darkred]
database[color=darkr
ed]
and[color=darkred]
future[color=darkred
]
filled[color=darkred
]
it[color=darkred]
|
|
|
|
|