
Re: asian characters handling in 2.0.27 -> 2.3.x upgrade

At 01:14 AM 2/16/2006, ST Wong (ITSC) wrote:
>Hi all,
>I have a problem upgrading my OpenLDAP 2.0.27 to 2.3.19.  Some
>attributes (directoryString) contain encoded Asian data (e.g.
>Big5, GB, Shift-JIS, etc.)

In LDAP, values of directoryString syntax are to be UTF-8 encoded.
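
To find which existing values would be rejected, a rough client-side
check along the following lines may help.  This is only a sketch: the
function name is made up for illustration, and it tests nothing more
than whether the bytes decode as UTF-8, not any other constraint the
server may apply to the attribute.

```python
def is_valid_utf8(value: bytes) -> bool:
    """Return True if the raw attribute value decodes as strict UTF-8."""
    try:
        value.decode("utf-8")  # Python's default error mode is strict
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    samples = {
        "utf-8": "中文".encode("utf-8"),
        "big5": "中文".encode("big5"),
    }
    for label, raw in samples.items():
        print(label, is_valid_utf8(raw))
```

Plain ASCII values pass this check, since ASCII is a subset of UTF-8.
Typical Big5 or Shift-JIS text fails it, but passing is not proof that
a value was meant as UTF-8.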

>which can't be imported to 2.3.19 using
>slapadd.  I've to convert them into UTF-8 in order to import these data
>into 2.3.19.

As noted previously on this list, we've improved slapd(8)
schema checking over the years.   Previous (or any current)
failure of slapd(8) to enforce any particular protocol
restriction should not be viewed as a license for a client
to not itself adhere to the protocol.

>I'd like to know if there is any workaround to import
>different Asian data without converting to UTF-8, or if there is any
>better alternate solution available.

You have three basic choices:
 1) Use UTF-8 encoded Unicode on the wire, converting to and
    from the local character set as needed on the client.
 2) Use custom attributes with octet string (instead of character
    string) based syntax and matching rules. 
 3) Use custom attributes with custom character string syntax
    and matching rules.  Requires server-side support.
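
For choice 1, the conversion before import can be sketched roughly as
below.  The helper names (to_utf8, ldif_line) are hypothetical, the
source encoding of each value is assumed to be known in advance, and
the safe-string test is a simplified reading of the LDIF rules that
does not fold long lines.

```python
import base64


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a legacy charset and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def ldif_line(attr: str, value: bytes) -> str:
    """Format one LDIF attribute line.

    Values that are not plain printable ASCII, or that start with a
    space, colon or '<', or end with a space, are written base64
    encoded using the "attr:: value" form.
    """
    safe = (
        all(0x20 <= b < 0x7F for b in value)
        and not value.startswith((b" ", b":", b"<"))
        and not value.endswith(b" ")
    )
    if safe:
        return f"{attr}: {value.decode('ascii')}"
    return f"{attr}:: {base64.b64encode(value).decode('ascii')}"


if __name__ == "__main__":
    legacy = "中文".encode("big5")  # stands in for a value from the old data
    print(ldif_line("cn", to_utf8(legacy, "big5")))
```

The conversion fails with UnicodeDecodeError if a value is not actually
in the stated encoding, which is usually preferable to importing
silently corrupted data.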

There are some obvious and some non-obvious trade-offs between
these choices.  Though some of the implementation details
(like how to provide server-side support for additional
syntaxes and matching rules) are server-dependent, the issues
involved in the trade-off are generally server-independent.
Hence, further discussion of these issues and trade-offs
should be taken to a general LDAP list such as <ldap@umich.edu>.

>Would anyone please help?
>Thanks a lot.
>ST Wong (st-wong@cuhk.edu.hk)