[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: ldap_explode_dn corrupts UTF-8 encoding (ITS#1890)



Hi,

First of all: thanks for discussing this with me...

On Mon, 17 Jun 2002, Pierangelo Masarati wrote:
> ps@psncc.at writes:
>
> > Well, the code fragment that broke is:
> >
> >         char **exploded_dn, *dn;
> >         LDAP *ld;
> >             LDAPMessage *e;
> >
> >         [snip]
> >
> >         dn = ldap_get_dn(ld, e);
> >         /* explode DN */
> >         exploded_dn = ldap_explode_dn(dn, FALSE);
> >
> >
> > Which is exactly what the man page for ldap_explode_dn suggests. And it is
> > straightforward too.
>
> I see a  /* deprecated */

In the .h file, but not in the documentation (man page), but I am picky
;-)

>
> LDAP_F( char ** )
> ldap_explode_dn LDAP_P(( /* deprecated */
>        LDAP_CONST char *dn,
>        int notypes ));
>
> in ldap.h; this is one of the reasons.
>
>
> >
> >> They are; but they're represented in another form that is allowed
> >> for DNs; it depends on whether you like it or not.  I understand
> >
> > I just think it is not good to break existing functionality.
>
> Agree, but up to a point: deprecation and obsoletion
> at some point of standard track (and software) lifetime
> may occur.

Agreed, but _both_ deprecating/obsoleting and breaking functionality in
one go is not very wise. It just disappoints people. In that case I would
rather suggest to just drop functionality alltogether.

> >> You may use:
> >>
> >>    int i;
> >>    LDAPDN *dn;
> >>    char **v = 0;
> >>
> >>    ldap_str2dn( string, &dn, LDAP_DN_FORMAT_LDAP);
> >>    for ( i = 0; dn[i]; i++ ) {
> >>            v = realloc( v, i + 2 );
> >>            ldap_rdn2str( dn[ 0 ][ i ], &v[ i ],
> >>                    LDAP_DN_FORMAT_LDAPV3 | LDAP_DN_PRETTY );
> >>    }
> >>

I have to say that ldap_str2dn has one of the most incomprehensible
interfaces I know. Without the (as of yet untested [by me]) code fragement
above, I would not be able to make sense of the man page in a limited
time (and I know my C quite well):

       typedef struct ldap_ava {
           char *la_attr;
           struct berval *la_value;
           unsigned la_flags;
       } LDAPAVA;

       typedef LDAPAVA** LDAPRDN;
       typedef LDAPRDN** LDAPDN;

with ldap_str2dn taking a LDAPDN** !!

There are quite some indirections here... but maybe this is just my buggy
mind.

> > That code looks a lot more complex and incomprehesible than the
> > straightforward code fragment above... :-(
>
> well, it's not so incomprehensible: you get a dn, and then
> rewrite each part in a readable form, choosing whatever
> format you need.  Or you use ldap_explode_dn, accepting
> the format that's given back.
>
> > It is mostly a matter of breaking things that used to work.
>
> I have no problems in letting ldap_explode_dn/rdn return
> a "pretty" (UTF-8) form of the dn: it's not a big deal; however
> I'd like to see some consensus on that, which is difficult
> to gather on obsoleted function calls: everybody would like
> to keep them as they were, regardless of incompatibilities
> or problems that might arise if fancy formats are used.

Well, UTF-8 is not fancy. All other information in LDAP is in UTF-8, or do
I have to await backslash-hex encoded characters everywhere (I know, I
should read all LDAP RFCs, but I seem to not get around to do it.)

I will try the above code fragment and a configure test, but I am not
happy about it.

ps