Re: multi-byte character sets



At 01:07 PM 10/21/99 -0400, Gary Williams wrote:
>I'm pretty well-versed in LDAP technology, but I'm lost when it
>comes to international character set issues.  I know with V3,
>strings are supposed to be UTF-8 encoded.  But that's only
>an encoding.  Does it specify what character set?

LDAPv3 strings use the ISO 10646 character set, a superset
of Unicode, encoded using UTF-8.

LDAPv2 strings use T.61.

In both protocols, certain strings (such as attribute types)
are limited to a subset of ASCII.

>If I have
>a client that's using a multi-byte character set and I UTF-8
>encode it before passing it to the API, have I fulfilled all
>the requirements?

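No.  UTF-8 is defined as an encoding of ISO 10646 code points,
so applying it to another character set's codes does not yield
a valid LDAPv3 string.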

>Or does the string first have to be
>translated to a standard character set (UNICODE?) and then
>UTF-8 encoded?

Yes. ISO 10646 for LDAPv3.  T.61 for LDAPv2.
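
For the LDAPv3 case, the two steps might look something like the
sketch below.  It is in Python only for brevity, and it assumes the
client's data is in Shift_JIS; the function name to_ldapv3_string is
made up for illustration and is not part of any LDAP API.  The T.61
conversion for LDAPv2 is not shown.

```python
def to_ldapv3_string(raw: bytes, source_charset: str) -> bytes:
    """Transcode bytes from a local charset into UTF-8 (hypothetical helper)."""
    # Step 1: translate the local character set to Unicode / ISO 10646
    # code points. Python's str type holds Unicode code points.
    text = raw.decode(source_charset)
    # Step 2: encode those code points as UTF-8.
    return text.encode("utf-8")


# Assumed input: the two characters U+65E5 U+672C stored as Shift_JIS bytes.
local_bytes = "\u65e5\u672c".encode("shift_jis")
ldap_value = to_ldapv3_string(local_bytes, "shift_jis")
```

The resulting bytes differ from the Shift_JIS input, which is why
passing the client's native bytes through unchanged is not enough.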



----
Kurt D. Zeilenga <Kurt@OpenLDAP.org>
OpenLDAP Project <http://www.OpenLDAP.org/>