
Re: utf-8 encode




> Juan Miguel de los Ríos Caparrós wrote:
> 
> Where must I indicate to LDAP for using  utf-8 encode or ISO 10646-1?... I´m using OpenLDAP 1.2.10 and Red Hat 6.2

You don't.  OpenLDAP 1.2.x is mostly character set transparent for
encodings that are ASCII-compatible (i.e. no wide characters and no
encodings that contain NUL bytes and the like).  So if you build your
directory using UTF-8, it will be in UTF-8.  From a formal point of
view, this is a violation of the standard: the data should be in T.61
(teletexString) and nothing else (notice that ISO 8859-1 is no good
either).
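
To illustrate the "ASCII-compatible, no NULs" condition, here is a
minimal Python sketch (not OpenLDAP code; the helper name
is_c_string_safe is made up for this example).  It assumes the usual
reason for the restriction, namely that values handled as C strings end
at the first NUL byte.  It compares UTF-8 with a wide encoding, UTF-16:

```python
# Sketch only: shows why UTF-8 can pass through a byte-transparent
# server while a wide encoding such as UTF-16 cannot.
name = "Juan Miguel de los Ríos Caparrós"

utf8 = name.encode("utf-8")
utf16 = name.encode("utf-16-le")


def is_c_string_safe(data: bytes) -> bool:
    """Hypothetical check: True if the bytes contain no NUL byte,
    so code that treats the value as a C string would not truncate it."""
    return b"\x00" not in data


# UTF-8 never produces a NUL byte for a non-NUL character.
utf8_safe = is_c_string_safe(utf8)

# UTF-16 pads every ASCII character with a NUL byte.
utf16_safe = is_c_string_safe(utf16)

# ASCII characters keep their single-byte ASCII values in UTF-8.
ascii_prefix_kept = utf8.startswith("Juan Miguel de los R".encode("ascii"))

print(utf8_safe, utf16_safe, ascii_prefix_kept)
```

Only the accented characters become multi-byte sequences in UTF-8, and
none of those bytes collide with ASCII, which is why a server that does
not look at the encoding can still store and return the values intact.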

However, you will have lots of company in this particular violation of
the standard.  For example, Netscape Communicator will assume the
directory is in UTF-8 by default, and I think the default cannot be
overridden in older versions.  Microsoft software seems to implement
some heuristic, both in clients and servers, that will often settle for
UTF-8, but I cannot provide further help since I simply don't
understand what the heuristic is.

In LDAPv3 (which OpenLDAP 1.2.x does *not* implement) it is UTF-8.
There is an implementation of v3 in the works in the CVS HEAD branch,
but you should stay away from it (unless you want to help, of course).

So you might want to plan on building your database as UTF-8 now, since
it will simplify migration later, and live with the aggression to the
standard temporarily.

Notice that neither the API nor the servers will do any kind of
translation whatsoever: if you build with some character set, you will
have to use that character set always.  Translation cannot be done
properly without knowing about the schema, since only some attribute
types should be translated (e.g. you don't want your JPEGs or your
certificates messed with).
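
If you ever do need to convert existing data yourself, the schema point
above can be sketched roughly as follows.  This is a minimal Python
illustration, not part of OpenLDAP: the function latin1_to_utf8, the
TEXT_ATTRS set and the dict-of-lists entry layout are all assumptions
made up for the example, and a real conversion would take the list of
text attribute types from your actual schema.

```python
# Sketch only: re-encode text attributes, leave binary ones untouched.
# TEXT_ATTRS is a hypothetical, hand-picked list for illustration.
TEXT_ATTRS = {"cn", "sn", "givenname", "o", "ou", "description"}


def latin1_to_utf8(entry):
    """Return a copy of `entry` (attribute name -> list of byte values)
    with text attribute values re-encoded from ISO 8859-1 to UTF-8."""
    converted = {}
    for attr, values in entry.items():
        if attr.lower() in TEXT_ATTRS:
            # Text value: decode from the old charset, encode in the new.
            converted[attr] = [
                v.decode("iso-8859-1").encode("utf-8") for v in values
            ]
        else:
            # Binary or unknown value (JPEG, certificate, ...): copy as-is.
            converted[attr] = list(values)
    return converted


entry = {
    "cn": [b"Juan Miguel de los R\xedos Caparr\xf3s"],  # ISO 8859-1 bytes
    "jpegPhoto": [b"\xff\xd8\xff\xe0"],                 # start of a JPEG
}
result = latin1_to_utf8(entry)
```

Defaulting to "copy as-is" for any attribute not listed is the cautious
choice here: a text value left unconverted can be fixed later, while a
photo or certificate that has been re-encoded is corrupted.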

Julio