Re: setlocale (ITS#319)

red@skazna.ru writes:

> Hello,
> I today build new OpenLDAP (1.2.7 version) and I see that all programs don`t
> set it`s locale ! So I must edit all main.c files to insert two string
> #include <locale.h>
> ...
> setlocale(LC_ALL,"");
> near main function. Only after this manipulations LDAP correctly works with
> russian names. Programs run not only in America !

We know, we know :-)

If you look at the core developers you will see that there is a
sizeable participation from Europe.

Now, to answer your question, I think doing what you suggest is not
enough and may very well lead you to a dead end.

Most people agree that the standard says LDAP v2 (what 1.2.7
implements) uses charset T.61 on the wire.  The library will not do
anything but move bits around without translation.  That means that
what the API returns to you is T.61.  Unless your locale is T.61
(which I very much doubt), that will do you no good.

What you are trying to do is to ignore that your directory is in T.61
and load it with something else instead.  Many of us have done so in
the past and have found incompatibilities with programs that know

Moreover, programs run not only in Western Europe either and T.61 is
grossly inadequate for Arabic, Hebrew, Greek, Russian, Chinese,
Japanese, Korean, etc.  For this reason, LDAP v3 mandates a new
character set: Unicode in the UTF-8 encoding.  As a matter of fact,
some programs support talking UTF-8 on LDAP version 2 (but you should
not; it is against the standard).

So adequate support requires:

	- Identifying whether the platform supports locales
	- Setting the locale if appropriate
	- Detecting if the runtime can translate characters for us
	- Detecting what attribute types are to be translated (we
	  don't want user certificates translated, do we?)
	- Translating values and DNs between different charsets
	- If the platform does not support some or all of the
	  above, provide adequate replacements

All of the above is needed for both T.61 and UTF-8, defaulting
appropriately depending on the protocol version in use.
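
To make two of the items above concrete, here is a rough sketch of
picking the default charset from the protocol version and of skipping
attribute types that must not be translated.  The enum, both function
names and the list of binary attributes are hypothetical; nothing like
this exists in the 1.2.7 sources, and a real list would be longer.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum wire_charset { CHARSET_T61, CHARSET_UTF8 };

/* LDAP v2 is taken to use T.61 on the wire, v3 mandates UTF-8. */
enum wire_charset default_wire_charset(int protocol_version)
{
    return protocol_version >= 3 ? CHARSET_UTF8 : CHARSET_T61;
}

/* Compare two attribute type names ignoring ASCII case. */
static int same_attr(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Returns 0 for attribute types holding binary data, 1 otherwise.
 * The list is an assumption and deliberately incomplete. */
int attr_needs_translation(const char *attr_type)
{
    static const char *const binary_attrs[] = {
        "userCertificate", "jpegPhoto", NULL
    };
    size_t i;

    for (i = 0; binary_attrs[i] != NULL; i++) {
        if (same_attr(attr_type, binary_attrs[i]))
            return 0;
    }
    return 1;
}
```

A caller would check attr_needs_translation() first and only then
convert the value between the local charset and the one given by
default_wire_charset().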

This is what has to be done.  Any help is welcome.  But, please, all
new features should be sent as diffs against the HEAD CVS branch.  It
is very unlikely that we can start from anything else.

If you feel like contributing changes to set the locale like you
suggest, I think they can be integrated (again, diffs against HEAD).
It will let you do what you want and I think it would be of use
later.  All such calls should be guarded by #ifdef HAVE_LOCALE_H lest
they break on lesser platforms.
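
Something along these lines is what I have in mind for the guard.  It
is only a sketch: init_locale() is a made-up helper name, and I am
assuming HAVE_LOCALE_H gets defined by the build (configure) when
<locale.h> is present.

```c
/* HAVE_LOCALE_H is assumed to come from the build system. */
#ifdef HAVE_LOCALE_H
#include <locale.h>
#endif

/* Hypothetical helper.  Returns 1 if the locale was set from the
 * environment, 0 if locales are unavailable or the call failed. */
int init_locale(void)
{
#ifdef HAVE_LOCALE_H
    /* "" selects the locale named by the LANG / LC_* variables. */
    return setlocale(LC_ALL, "") != NULL;
#else
    return 0;
#endif
}
```

Each program would call init_locale() once at the top of main(),
before any output is produced.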

Before I forget, though there is support for running the server in
some locale, I think you should leave that aside.  In general, a server
should be able to support clients in several locales at the same time.
That's what Unicode is for, and server-side sorting and the like
should be implemented with the help of controls and matching rules.


P.S. What *I* do is run the pre-2.0 versions (that are under
development) with UTF-8 all around.  My CGIs and Perl programs either
translate on the fly or use charset UTF-8 (it is a matter of sending
the browser the right headers).  The problem is that using ldapsearch
and its friends gets tricky; I use translators from/to UTF-8 for that,
usually as a utf8edit wrapper script.  Sorry, it only does Latin1.
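
For the curious, the Latin1 to UTF-8 direction is small enough to show
here.  This is an illustrative sketch, not the code of the wrapper
script mentioned above; the function name latin1_to_utf8 is made up.

```c
#include <stddef.h>

/* Converts n bytes of ISO 8859-1 text in src to UTF-8 in dst and
 * NUL-terminates the result.  dst must have room for 2*n + 1 bytes.
 * Returns the number of bytes written, not counting the NUL. */
size_t latin1_to_utf8(const unsigned char *src, size_t n,
                      unsigned char *dst)
{
    size_t i, out = 0;

    for (i = 0; i < n; i++) {
        unsigned char c = src[i];

        if (c < 0x80) {
            /* ASCII is encoded as itself. */
            dst[out++] = c;
        } else {
            /* 0x80..0xFF map to U+0080..U+00FF: two bytes each. */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

The reverse direction has to reject any code point above U+00FF, which
is exactly why such a translator only does Latin1.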