
Re: ldap_explode_dn corrupts UTF-8 encoding (ITS#1890)

To: openldap-its@OpenLDAP.org
Subject: Re: ldap_explode_dn corrupts UTF-8 encoding (ITS#1890)
From: ps@psncc.at
Date: Mon, 17 Jun 2002 08:34:02 GMT

On Mon, 17 Jun 2002, Kurt D. Zeilenga wrote:

> Date: Mon, 17 Jun 2002 01:13:04 -0700
> From: Kurt D. Zeilenga <Kurt@OpenLDAP.org>
> To: ps@psncc.at
> Cc: openldap-its@OpenLDAP.org
> Subject: Re: ldap_explode_dn corrupts UTF-8 encoding (ITS#1890)
>
> At 01:00 AM 2002-06-17, ps@psncc.at wrote:
> >On Mon, 17 Jun 2002, Pierangelo Masarati wrote:
> >
> >> > OpenLDAP 2.1.2 seems to corrupt non-ASCII UTF-8 encoded characters.
> >> > It actually turns unprintable chars (in the ASCII sense) into \<hexcode>.
> >>
> >> I think this is a leftover of when we decided to use UTF8 instead
> >> of the '\' + HEXPAIR representation of non-ascii chars, and initially
> >> it was intended; of course, when parsing a DN, one wants the correct
> >> UTF8 encoding.
> >
> >Note that the problem does not exist in 2.0.23...
>
> Difference is within specification and necessary to address
> other issues.
>
> >To further elaborate the problem: before passing the DN to the
> >ldap_explode_dn function it is properly (UTF-8) encoded. Afterwards the DN
> >parts aren't...
>
> Hex pairs can appear in properly encoded DN strings.  See RFC 2253.
>

OK, I'll buy that; a very brief glance at the RFC makes me believe it.
The problem is: Hex encoding is IMHO stupid^H^H^H^H^H^Hnot wise to use if
one has to deal with UTF-8 anyway. UTF-8 is easy to handle using e.g. the
iconv_* stuff. Having to check for hexpairs adds yet another layer to the
processing and definitely offers no speed increase ;-)
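
To make that extra layer concrete, here is a rough sketch of what a caller
might do with each string returned by ldap_explode_dn to get raw UTF-8 bytes
back. The names dn_value_unescape and hex_value are hypothetical helpers of
my own, not part of libldap. The sketch assumes only the RFC 2253 escapes
'\' + HEXPAIR and '\' + single character, and it does not handle the
'#'-prefixed hex form of attribute values.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper (not a libldap function): copy `in` to `out`,
 * turning "\C3" style hex pairs into the byte they stand for and
 * "\," style escapes into the escaped character itself.
 *
 * out_size is the capacity of `out`, including the terminating NUL.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the input is malformed or `out` is too small.
 */
long dn_value_unescape(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    while (*in != '\0') {
        unsigned char byte;

        if (*in == '\\') {
            int hi = hex_value((unsigned char)in[1]);

            if (hi >= 0) {
                /* A hex digit after '\' must start a full hex pair. */
                int lo = hex_value((unsigned char)in[2]);
                if (lo < 0)
                    return -1;
                byte = (unsigned char)(hi * 16 + lo);
                in += 3;
            } else if (in[1] != '\0') {
                /* Escaped special character, e.g. "\," or "\+". */
                byte = (unsigned char)in[1];
                in += 2;
            } else {
                return -1;          /* trailing backslash */
            }
        } else {
            byte = (unsigned char)*in++;
        }

        if (n + 1 >= out_size)      /* keep room for the NUL */
            return -1;
        out[n++] = (char)byte;
    }

    if (out_size == 0)
        return -1;
    out[n] = '\0';
    return (long)n;
}
```

Called on a component such as "cn=J\C3\BCrgen", this should yield the same
bytes as the UTF-8 string for "cn=Jürgen", which can then be handed to
iconv as before. Note that it also removes the escaping of specials like
',' and '+', so the result is meant for display or conversion, not for
building a new DN string.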

_And_ OpenLDAP 2.1.2 breaks the functionality found in 2.0.*. I understand
the problem to be more about compatibility than about it being a real bug
(though I first believed it to be one).

ps
