Re: Characters in DN
On Wednesday 11 July 2001 02:13 am, Pierangelo Masarati wrote:
> "David A. Cooper" wrote:
> > OK, I have now integrated my versions of the dn_validate and dn_normalize
> > functions into current development branch code and have posted the new
> > patch file to http://csrc.nist.gov/pki/testing/openLDAP_contrib.html.
> > Feel free to check it out and, if you think it is appropriate, to commit
> > the changes.
> Your code looks ok. You should really submit an ITS
> so we can keep track of the changes.
OK, I'll go to the OpenLDAP Web site and submit an ITS.
> I have only one question. You treat '=' and '#' as
> characters that need to be escaped. While rfc 2253
> says implementations may escape other characters,
> it doesn't require them to be treated as special except
> in type/value separation (=) and beginning of string (#).
> I think you should handle them differently.
Actually, RFC 2253 isn't entirely clear on this issue. In section 2.4 it
states that '=' and '#' (except at the beginning of a string) do not need
to be escaped. However, the BNF in section 3 states that any character other
than a stringchar must be escaped, where:
stringchar = <any character except one of special, "\" or QUOTATION >
special = "," / "=" / "+" / "<" / ">" / "#" / ";"
QUOTATION = <the ASCII double quotation mark character '"' decimal 34>
In my code, I compromised. The dn_validate/dn_normalize functions will
take as input DNs that contain unescaped '=' and '#' characters, but in the
normalized form they are escaped. So for example, the input "cn= ===###" is
accepted, but the output is "CN=\=\=\=\#\#\#".
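The compromise can be sketched roughly as follows. This is a Python
illustration only, not the code in the patch: the name normalize_rdn is
hypothetical, it handles a single type=value pair (no multi-valued RDNs,
quoted strings, or hex-string values), and it assumes, as the example above
suggests, that normalization upper-cases the attribute type and drops
leading spaces from the value.

```python
# Characters escaped in the normalized output: the RFC 2253 section 3
# "special" set plus the backslash and the double quote.
SPECIAL = set(',=+<>#;"\\')


def normalize_rdn(rdn: str) -> str:
    """Normalize one 'type=value' pair (hypothetical helper, sketch only)."""
    # The first '=' separates type from value; later ones belong to the value.
    attr_type, sep, value = rdn.partition("=")
    if not sep or not attr_type.strip():
        raise ValueError("expected type=value")
    value = value.lstrip(" ")  # assumption: leading spaces are dropped

    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Already escaped on input: copy the pair through unchanged.
            out.append(value[i:i + 2])
            i += 2
        elif ch in SPECIAL:
            # Unescaped special accepted on input, escaped on output.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return attr_type.strip().upper() + "=" + "".join(out)
```

With this sketch, normalize_rdn("cn= ===###") yields CN=\=\=\=\#\#\#, and
an input that is already escaped, such as cn=a\=b, passes through with only
the type upper-cased.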
I see that the BNF in draft-ietf-ldapbis-dn-05.txt is different. It defines
stringchar = <any UTF-8 character (can be multiple octets)
except one of escaped or ESC>
escaped = "," / "+" / """ / "<" / ">" / ";"
So, while the BNF of draft-ietf-ldapbis-dn-05.txt is clear that '=' and '#'
do not need to be escaped, the BNF in RFC 2253 suggests otherwise (in
contradiction to the text of RFC 2253).
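To make the difference between the two grammars concrete, the sets of
characters that each BNF excludes from stringchar can be compared directly.
This is only a sketch built from the productions quoted above; it assumes
that ESC in the draft denotes the backslash, and the variable names are made
up for illustration.

```python
# RFC 2253, section 3: everything that is not a stringchar
# (the "special" set, the backslash, and QUOTATION).
rfc2253_must_escape = set(',=+<>#;') | {'\\', '"'}

# draft-ietf-ldapbis-dn-05: the "escaped" set plus ESC
# (assumed here to be the backslash).
draft05_must_escape = set(',+"<>;') | {'\\'}

# Characters that only one of the two grammars requires to be escaped.
only_in_rfc2253 = rfc2253_must_escape - draft05_must_escape
only_in_draft05 = draft05_must_escape - rfc2253_must_escape
```

Under those assumptions only_in_rfc2253 contains exactly '=' and '#', and
only_in_draft05 is empty, so the two grammars disagree on precisely the two
characters in question.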
To some degree, though, the issue is academic. The code will accept
DNs with unescaped '=' and '#', and the normalized versions of the DNs, with
the '=' and '#' escaped, are definitely compliant with RFC 2253. The only
question is whether a few bytes are being wasted by including escape
characters where they may not be absolutely necessary.
However, if people feel that this is important, I'll look into changing the
code to avoid escaping '=' and '#' (unless the '#' is the first character in
> Someone who's directly involved in unicode stuff
> should check the UTF part before anything is added.
> I'm totally stuck with it at present (see why we need
That would be helpful. Originally, Stig Venås incorporated Unicode
normalization into the dnNormalize function by making a call to
UTF8normalize. I used the UTF8normalize function as a guide in order to
directly incorporate calls to uccanondecomp and uccanoncomp into my code. I
tried it out with a few simple examples, and the code seems to be handling
Unicode normalization properly, but it would certainly be helpful to have
others test it out as well.
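For anyone who wants reference values to test against, canonical
decomposition followed by canonical composition can be reproduced
independently with Python's standard unicodedata module. This is a sketch
for generating expected results only; it does not use the uccanondecomp or
uccanoncomp calls, and the helper name canonical_normalize is made up for
illustration.

```python
import unicodedata


def canonical_normalize(s: str) -> str:
    """Canonical decomposition followed by canonical composition."""
    decomposed = unicodedata.normalize("NFD", s)
    return unicodedata.normalize("NFC", decomposed)


# Two encodings of the same value:
precomposed = "cn=Ven\u00e5s"   # U+00E5 LATIN SMALL LETTER A WITH RING ABOVE
combining = "cn=Vena\u030as"    # 'a' followed by U+030A COMBINING RING ABOVE

same_after_normalization = (
    canonical_normalize(precomposed) == canonical_normalize(combining)
)
```

The two input strings differ code point by code point but compare equal
after normalization, which is the property a normalized DN comparison
would rely on.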