
Re: Default Pretty Form of DNs



The 2.0 slapd(8) currently maintains each DN in two forms: user
and normalized.  The HEAD slapd(8) currently maintains each
DN in two forms: pretty and normalized.

User - is what the user (client) provided.
Pretty - a value-preserved representation of the DN.
Normalized - a value-modified representation of the DN.

The primary reason we have a pretty form is so the server
can be "liberal in what it accepts but strict in what it
produces".  That is, we can accept DNs which do not conform
to RFC 2253 (e.g. LDAPv2 DNs) and "pretty" them into a
form which does conform to RFC 2253.  When logging, generally
the pretty DN is used.

The normalized DNs are maintained to speed up certain
operations.  However, there are cases, modrdn for example,
where one must manipulate the pretty form.  So, designing
the pretty form such that it aids such manipulation is
prudent.

I generally believe that the pretty form should be minimally
escaped.  That is, only the characters which RFC 2253 requires
to be escaped should be escaped.  Secondly, the escape
character '\' should be escaped using hexpair "\5C" instead
of "\\" to make the DN easier to manipulate.  Lastly, space
characters which need escaping should be presented in hexpair
form to conform to the (in error) ABNF in RFC 2253.
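
To make the three rules above concrete, here is a minimal sketch
in Python (not slapd code).  The name pretty_escape is
hypothetical, and the list of characters that must be escaped is
my reading of the prose in RFC 2253 section 2.4; treat both as
assumptions.  It escapes a single attribute value, not a whole DN.

```python
# Hypothetical helper, not taken from slapd: minimally escape one
# attribute value for use in a pretty DN string.

# Characters that must be escaped anywhere in a value (assumed from
# the prose of RFC 2253 section 2.4), other than the backslash.
SPECIALS = set(',+"<>;')


def pretty_escape(value: str) -> str:
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch == "\\":
            # Escape character itself: hexpair, never "\\".
            out.append("\\5C")
        elif ch == " " and (i == 0 or i == last):
            # Only leading/trailing spaces need escaping: hexpair.
            out.append("\\20")
        elif ch == "#" and i == 0:
            # '#' needs escaping only at the start of the value.
            out.append("\\#")
        elif ch in SPECIALS:
            out.append("\\" + ch)
        else:
            # Everything else, including non-ASCII, passes through.
            out.append(ch)
    return "".join(out)
```

For example, pretty_escape(" a ") gives \20a\20, and a value
containing a single backslash between "a" and "b" gives a\5Cb.
Interior spaces and non-ASCII characters are left untouched.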

And, yes, DN (in any form) and other things containing UTF-8
may be logged.


At 02:33 PM 2001-12-24, Stig Venaas wrote:
>On Mon, Dec 24, 2001 at 09:54:47AM -0800, Kurt D. Zeilenga wrote:
>> I prefer the hexpair form as it makes a number of DN string
>> manipulations much easier.  For example, issuffix needs to
>> check that it's splitting at a separator.  If hexpair's are
>> used, then it just needs to check for ',' at the split.
>> But if "\," and '\\' are allowed to appear, one needs to
>> do more checks.  As a compromise, I'd be happy if "\5C" was
>> used instead of "\\".
>
>When is pretty to be used? I thought pretty was for output to
>humans (and logs), of course it might make sense then as well.
>My thinking is that we should have a normalization form that is
>used internally, and easy parsing would make sense there as well.
>Unless we can parse it only once, and pass the parsed result
>around. I think the normalized form (used internally), should be
>UTF-8 (no hex for non-ascii). I see no reason to hex escape non-
>ascii, only extra work. For output to logs (pretty form?), we
>should hex escape some non-ascii characters, maybe all. Could
>you Ando, or someone else?, tell me why you want to do hex-
>escaping of non-ascii internally? I'm not in doubt what I want,
>but I could be missing something... (: I really should read that
>piece of code, but I think I'll do some other stuff and wait
>till it settles a bit. Looks like three of you are working a lot
>on this right now.
>
>Merry X-mas (or maybe more PC, Seasons's Greetings),
>
>Stig
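
On the issuffix point quoted above, a minimal sketch in Python
(not the slapd routine) of why "\5C" helps.  It assumes both
strings are already in a pretty form where a literal backslash is
never written as "\\", so a ',' is a separator exactly when the
character before it is not '\'.  The name is_suffix and the exact
string comparison (no matching rules, no case folding) are
assumptions made for illustration.

```python
# Hypothetical sketch, not the slapd implementation.
# Assumes a pretty form in which '\' is always written as "\5C".


def is_suffix(dn: str, suffix: str) -> bool:
    if suffix == "":
        # Assumption: the empty (root) suffix matches every DN.
        return True
    if not dn.endswith(suffix):
        return False
    cut = len(dn) - len(suffix)
    if cut == 0:
        # The DN is the suffix itself.
        return True
    if dn[cut - 1] != ",":
        # Not splitting at a separator at all.
        return False
    # Because "\\" never appears, a single look-back decides
    # whether this ',' is escaped (part of a value) or a separator.
    return not (cut >= 2 and dn[cut - 2] == "\\")
```

With this convention a value that ends in a backslash appears as
"...\5C," so the comma after it is still recognized as a
separator, while "\," is always an escaped comma.  If "\\" were
allowed, the check would have to count the preceding backslashes.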