
Re: openldap 2.0.18 + chinese characters (Big5 or GB)



On Tue, 30 Oct 2001, Kurt D. Zeilenga wrote:

> At 07:05 PM 2001-10-30, KH Lau wrote:
> >Is it possible to add entry with Chinese Characters (Big5 or GB) using ldapadd ?
>
> Unless you define a syntax which allows such encodings, no.
> Values of directoryString syntax in LDAPv3 are restricted to
> ISO 10646-1 characters encoded using UTF-8.

hrm - so convert from Big5 or GB to UTF-8?  There are lots of apps that do
that.  If you're under Linux, use xemacs-mule and save as UTF-8...
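
If you'd rather script the conversion than do it in an editor, something
like this rough Python sketch should work. The function name and the sample
text are made up for illustration; "big5" and "gb2312" are codec names from
Python's standard library, and I'm assuming your data really is in the
encoding you name:

```python
def to_utf8(raw: bytes, source_encoding: str = "big5") -> bytes:
    """Decode bytes from a legacy Chinese encoding and re-encode as UTF-8.

    source_encoding is an assumption about the input, e.g. "big5" or
    "gb2312"; a wrong guess raises UnicodeDecodeError instead of
    silently producing garbage.
    """
    return raw.decode(source_encoding).encode("utf-8")


# Hypothetical sample: two characters as they would arrive in Big5.
big5_bytes = "中文".encode("big5")
utf8_bytes = to_utf8(big5_bytes, "big5")
```

The UTF-8 bytes are what the directory expects; the original Big5 or GB
bytes are not.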

(is there a FAQ reference on languages on the OpenLDAP site? - it'd be in
the user section.  I've got no idea how that FAQ system works so... :)
something like:

Q: I use multibyte encoding in my data.  Is there a way to use this in
OpenLDAP?

A: Yes, if you convert it to the UTF-8 encoding of ISO 10646-1, as the
LDAPv3 specification requires.  Some platforms refer to this as "Unicode",
although that is a misnomer.
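
One related wrinkle when feeding such values to ldapadd: LDIF itself is an
ASCII format, so a non-ASCII UTF-8 value has to be written base64-encoded
after a double colon ("attr:: ...").  A rough Python sketch of that, with a
made-up helper name and a simplified version of the "is this value safe as
plain text" check, so treat it as an approximation rather than a full LDIF
writer:

```python
import base64


def ldif_line(attr: str, value: str) -> str:
    """Format one attribute/value pair as an LDIF line (hypothetical helper).

    Simplified rule: plain "attr: value" only for ASCII values that do not
    start with space, ':' or '<', do not end with a space, and contain no
    CR, LF or NUL.  Everything else is UTF-8 encoded, then base64 encoded.
    Long-line folding is not handled here.
    """
    needs_b64 = (
        not value.isascii()
        or value[:1] in (" ", ":", "<")
        or value.endswith(" ")
        or any(c in value for c in "\r\n\0")
    )
    if needs_b64:
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        return f"{attr}:: {encoded}"
    return f"{attr}: {value}"


print(ldif_line("cn", "Lau"))
print(ldif_line("cn", "中文"))
```

The first call should come out as an ordinary "cn: Lau" line, the second as
a "cn:: " line carrying the base64 form of the UTF-8 bytes.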

I believe this is an LDAP issue, not just an OpenLDAP one, btw...  although
it would be good manners to have a reference on the OpenLDAP FAQ site and
not just force one to read lots of RFCs hoping to make heads or tails...

> >The character followed by the language tag "lang-big5" is a Big5 Chinese Character.
>
> Language tags indicate language not character set.  lang-cn
> would be more appropriate.

lang-zh-big5  (Big5 encoding)
lang-zh-gb*   (GB encoding sets)
lang-zh       (UTF8 I think)
(or lang-zh, lang-zh-cn, lang-zh-hk, lang-zh-tw for 'Chinese', China, Hong
Kong, and Taiwan)...  *heh*.  Lots of splits there - TW and HK tend to
Big5/UTF8 though.
But I'm being pedantic.  lang-cn is someplace else (European IIRC :).

Standard UTF8 and not Java right? :)  (Java has a couple extensions - none
of which matter)  (don't answer please :)

G'day, eh? :)
	- Teunis Peters