Re: commit: ldap/libraries/libldap getdn.c
Stig Venaas wrote:
> On Wed, Dec 05, 2001 at 11:01:17AM +0100, Pierangelo Masarati wrote:
> > I think this will go in between ldap_str2dn and ldap_dn2str inside
> > the new dnNormalize; this should also be done selectively on the
> > values of the attributes whose syntax allows UTF-8 data.
> I must confess I haven't looked at your code, but I think that in all
> cases where you consider casefolding (uppercasing), you should think
> about Unicode. Of course if you know that some string (or part of
> string) is plain ASCII you can ignore Unicode there.
> > We cannot work at the string level because all UTF-8 that is not
> > plain ascii is already represented as '\' + HEXPAIR; my guess is
> Yes, so either we need to normalize it before it is escaped or we
> need to actually have something that reads in hex like this, and
> outputs new hex values. To me it sounds reasonable to do it before
> it is escaped, but I haven't looked at your code...
> > we need to implement the schema aware dnNormalize to have UTF-8
> > normalization in place in an efficient manner, although we have
> > to deal with the overhead of finding the AttributeDescription of
> > each ava in the LDAPDN structure. To this purpose we could store
> > it in the LDAPAVA as well, possibly only if inside the server and
> > if explicitly required by a flag. Sort of:
> I need to look at your code, but are you (or should we) perhaps
> internally store the dn as a list of rdns, and have pointers to
> ldap_ava structs in there? And then only translate the dn into a
> string when necessary? The translated value can be stored/cached
> somewhere and reused.
I think this is the direction; for now, we are able
to translate strings into structural representations
and vice versa; we use this round trip to validate,
normalize, or pretty-print a DN every time it is required.
At present, all of these functions are a mere check
that the sequence of operations succeeds; the actual
normalization (which will not mean just uppercasing
any more, I guess) will be done inside dnNormalize
between the two operations.
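
To illustrate where the normalization step would sit, here is a
minimal, self-contained sketch of the parse / normalize / serialize
sequence. It does not use libldap: the toy_* types and functions are
hypothetical stand-ins for LDAPDN, ldap_str2dn, and ldap_dn2str, the
grammar ignores escapes and multi-valued RDNs, and the normalization
itself is only a placeholder ASCII uppercase.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_AVAS 8
#define TOY_MAX_LEN  64

/* Hypothetical stand-in for an LDAPAVA: one attribute=value pair. */
typedef struct toy_ava {
    char attr[TOY_MAX_LEN];
    char value[TOY_MAX_LEN];
} toy_ava;

/* Hypothetical stand-in for an LDAPDN (single-valued RDNs only). */
typedef struct toy_dn {
    int     n;
    toy_ava avas[TOY_MAX_AVAS];
} toy_dn;

/* Step 1: string -> structure. Returns 0 on success, -1 on bad input. */
static int toy_str2dn(const char *s, toy_dn *dn)
{
    assert(s != NULL && dn != NULL);
    dn->n = 0;
    while (*s != '\0') {
        if (dn->n == TOY_MAX_AVAS) return -1;
        const char *eq = strchr(s, '=');
        if (eq == NULL) return -1;
        const char *end = strchr(eq + 1, ',');
        if (end == NULL) end = eq + strlen(eq);
        size_t alen = (size_t)(eq - s);
        size_t vlen = (size_t)(end - (eq + 1));
        if (alen == 0 || alen >= TOY_MAX_LEN || vlen >= TOY_MAX_LEN) return -1;
        toy_ava *ava = &dn->avas[dn->n++];
        memcpy(ava->attr, s, alen);       ava->attr[alen] = '\0';
        memcpy(ava->value, eq + 1, vlen); ava->value[vlen] = '\0';
        s = (*end == ',') ? end + 1 : end;
    }
    return dn->n > 0 ? 0 : -1;
}

/* Step 2: normalize the structure in place. Placeholder only: a
 * schema-aware version would pick a rule per attribute syntax. */
static void toy_normalize(toy_dn *dn)
{
    for (int i = 0; i < dn->n; i++) {
        for (char *p = dn->avas[i].attr; *p; p++)
            *p = (char)toupper((unsigned char)*p);
        for (char *p = dn->avas[i].value; *p; p++)
            *p = (char)toupper((unsigned char)*p);
    }
}

/* Step 3: structure -> string. Returns 0 on success, -1 if out is too small. */
static int toy_dn2str(const toy_dn *dn, char *out, size_t outlen)
{
    size_t used = 0;
    for (int i = 0; i < dn->n; i++) {
        int w = snprintf(out + used, outlen - used, "%s%s=%s",
                         i ? "," : "", dn->avas[i].attr, dn->avas[i].value);
        if (w < 0 || (size_t)w >= outlen - used) return -1;
        used += (size_t)w;
    }
    return 0;
}

/* The whole sequence: normalization happens between the two conversions. */
static int toy_dn_normalize(const char *in, char *out, size_t outlen)
{
    toy_dn dn;
    if (toy_str2dn(in, &dn) != 0) return -1;
    toy_normalize(&dn);
    return toy_dn2str(&dn, out, outlen);
}
```

Calling toy_dn_normalize on "cn=Foo,dc=Example" with a large enough
buffer fills it with the re-serialized, uppercased DN; a string that
does not parse is rejected before any normalization is attempted.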
The hack I introduced last night is a mere uppercase
AFTER the sequence of operations has succeeded. This means
that Unicode is not affected, as it is already represented
as '\' + HEXPAIR (and thus is NOT normalized!). This, of
course, is a flaw we cannot accept in the long run; we have
to live with it until I work out the schema-aware
normalization.
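
A small sketch of why the ASCII uppercase pass is not enough once the
value is escaped. The helper names are hypothetical and the code does
not touch libldap; it only assumes that non-ASCII UTF-8 bytes arrive as
'\' + HEXPAIR. The values "école" and "École" differ in one letter's
case, but their escaped forms are cn=\C3\A9cole and cn=\C3\89cole, and
uppercasing the string only changes the ASCII letters around the
escapes.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Uppercase the ASCII letters of an already-escaped DN string, in place.
 * Hex digits inside '\' + HEXPAIR are uppercased too, which is harmless,
 * but the byte values they encode are never changed. */
static void ascii_upper(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Returns 1 if the two escaped DN strings compare equal after the
 * ASCII-only pass, 0 otherwise (also 0 if either is too long). */
static int equal_after_ascii_upper(const char *a, const char *b)
{
    char x[128], y[128];
    if (strlen(a) >= sizeof x || strlen(b) >= sizeof y) return 0;
    strcpy(x, a);
    strcpy(y, b);
    ascii_upper(x);
    ascii_upper(y);
    return strcmp(x, y) == 0;
}
```

With plain ASCII values such as "cn=ecole" and "CN=Ecole" the two
strings compare equal after the pass; with the escaped pair above they
still differ, because \C3\A9 and \C3\89 are left as they are.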
In my opinion we should pass the structural representation
AND the NORMALIZED string representation everywhere (maybe
also the PRETTY representation?) possibly with a state
flag so that every time a particular representation is
required it can be generated if not available yet; this
avoids unnecessary conversions.
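
Something along these lines, as a rough sketch only: the dn_bundle
type, its flag names, and the helper functions are all hypothetical,
the "structural" member is just the source string standing in for a
real LDAPDN, and the conversion is a placeholder ASCII uppercase. The
point is only the state flag: the normalized string is built the first
time it is asked for and reused afterwards.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Which cached representations are currently valid. */
enum {
    DN_HAS_NORMALIZED = 1u << 0,
    DN_HAS_PRETTY     = 1u << 1   /* would follow the same pattern */
};

typedef struct dn_bundle {
    const char *structural;  /* stand-in for the structural form */
    char       *normalized;  /* cached string, valid if flag is set */
    unsigned    state;       /* bitmask of DN_HAS_* flags */
    int         conversions; /* how many times we actually converted */
} dn_bundle;

static void dn_bundle_init(dn_bundle *b, const char *structural)
{
    assert(b != NULL && structural != NULL);
    b->structural  = structural;
    b->normalized  = NULL;
    b->state       = 0;
    b->conversions = 0;
}

/* Placeholder conversion: heap copy with ASCII letters uppercased. */
static char *dup_upper(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (out == NULL) return NULL;
    for (size_t i = 0; i <= n; i++)   /* i == n copies the terminator */
        out[i] = (char)toupper((unsigned char)s[i]);
    return out;
}

/* Return the normalized string, generating it only if not cached yet.
 * Returns NULL on allocation failure. */
static const char *dn_bundle_normalized(dn_bundle *b)
{
    assert(b != NULL);
    if (!(b->state & DN_HAS_NORMALIZED)) {
        b->normalized = dup_upper(b->structural);
        if (b->normalized == NULL) return NULL;
        b->state |= DN_HAS_NORMALIZED;
        b->conversions++;
    }
    return b->normalized;
}

static void dn_bundle_free(dn_bundle *b)
{
    assert(b != NULL);
    free(b->normalized);
    b->normalized = NULL;
    b->state = 0;
}
```

Asking twice for the normalized form of the same bundle returns the
same pointer and performs one conversion; the pretty form could be
cached behind DN_HAS_PRETTY in the same way.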
Dr. Pierangelo Masarati | voice: +39 02 2399 8309
Dip. Ing. Aerospaziale | fax: +39 02 2399 8334
Politecnico di Milano |
via La Masa 34, 20156 Milano, Italy |