

> > -----Original Message-----
> > From: Norbert Klasen [mailto:norbert.klasen@daasi.de]

> > Improperly - but quite common, at least here in Germany. RFC2459 also
> > says that a TeletexString should be interpreted as Latin1:
> >    In addition, many legacy implementations support names encoded in the
> >    ISO 8859-1 character set (Latin1String) but tag them as
> >    TeletexString.  The Latin1String includes characters used in Western
> >    European countries which are not part of the TeletexString character
> >    set.  Implementations that process TeletexString SHOULD be prepared
> >    to handle the entire ISO 8859-1 character set.[ISO 8859-1]
> >
> > Is there already a function like ldap_t61s_to_utf8s for latin1?
> > Implementing it shouldn't be much work since each code point in Latin1
> > is the same in Unicode.

I have added ldap_ucs_to_utf8s to handle this. It also handles the ASN.1
BMPString and UniversalString formats. Unfortunately, the existing
ldap_x_wcs_to_utf8s was not directly usable for this purpose because (a)
there is no guarantee that a wchar_t is big enough to hold 32 bits and (b)
there is no specification of the byte order within a wchar_t. I have tested
the new code with some 8-bit characters; please give it a try.
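
For illustration only, here is a minimal sketch of the general idea, not
the actual ldap_ucs_to_utf8s code. The names cp_to_utf8 and ucs_to_utf8 and
their signatures are hypothetical. The sketch assumes the input is a run of
fixed-width code units in big-endian byte order (1 byte for Latin1, 2 for
BMPString, 4 for UniversalString) and assembles each code point byte by
byte into a uint32_t, so it depends on neither the size nor the byte order
of wchar_t. Rejecting surrogate values is also an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into out[0..3].
 * Returns the number of bytes written (1-4), or 0 if cp is not encodable. */
static size_t cp_to_utf8(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;           /* assumption: reject surrogate values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                   /* beyond the Unicode range */
}

/* Hypothetical converter: width is the size of one code unit in bytes,
 * 1 (Latin1), 2 (BMPString) or 4 (UniversalString), big-endian.
 * Writes a NUL-terminated UTF-8 string to out and returns its length in
 * bytes (excluding the NUL), or -1 on bad input or insufficient space. */
long ucs_to_utf8(const unsigned char *in, size_t inlen, int width,
                 unsigned char *out, size_t outcap)
{
    size_t i, o = 0;

    if (width != 1 && width != 2 && width != 4)
        return -1;
    if (inlen % (size_t)width != 0)
        return -1;              /* truncated code unit */

    for (i = 0; i < inlen; i += (size_t)width) {
        uint32_t cp = 0;
        unsigned char buf[4];
        size_t n, j;
        int k;

        /* Assemble the code point from bytes; no wchar_t involved. */
        for (k = 0; k < width; k++)
            cp = (cp << 8) | in[i + (size_t)k];

        n = cp_to_utf8(cp, buf);
        if (n == 0 || o + n >= outcap)
            return -1;          /* invalid code point or no room left */
        for (j = 0; j < n; j++)
            out[o++] = buf[j];
    }
    if (o >= outcap)
        return -1;              /* no room for the terminating NUL */
    out[o] = '\0';
    return (long)o;
}
```

With width 1 this reduces to the Latin1 case from the question: the six
bytes 'M' 0xFC 'l' 'l' 'e' 'r' come out as the seven UTF-8 bytes
'M' 0xC3 0xBC 'l' 'l' 'e' 'r'.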

>   -- Howard Chu
>   Chief Architect, Symas Corp.       Director, Highland Sun
>   http://www.symas.com               http://highlandsun.com/hyc
>   Symas: Premier OpenSource Development and Support