
Re: Default Pretty Form of DNs



> The 2.0 slapd(8) currently maintains each DN in two forms: user
> and normalized.  The HEAD slapd(8) currently maintains each
> DN in two forms: pretty and normalized.
> 
> User - is what the user (client) provided.
> Pretty - a value-preserved representation of the DN.
> Normalized - a value-modified representation of the DN.
> 
> The primary reason we have a pretty form is so the server
> can be "liberal in what it accepts but strict in what it
> produces".  That is, we can accept DNs which do not conform
> to RFC 2253 (e.g. LDAPv2 DNs) 

I'd say that the current HEAD code already accepts many DN forms,
including DCE.

> and "pretty" them into a
> form which does conform to RFC 2253.  When logging, generally
> the pretty DN is used.
> 
> The normalized DNs are maintained to speed up certain
> operations.  However, there are certain cases where
> one must manipulate the pretty form. For example, modrdn.
> So, designing the pretty form such that it aids such
> manipulation is prudent.
> 
> I generally believe that the pretty form should be minimally
> escaped.  That is, only the characters which RFC 2253 requires
> to be escaped should be escaped.  Secondly, the escape
> character '\' should be escaped using hexpair "\5C" instead
> of "\\" to make the DN easier to manipulate.  Lastly, space
> characters which need escaping should be presented in hexpair
> form to conform to the (in error) ABNF in RFC 2253.

I don't have problems with minimal, partial or total escaping; 
my concern is with providing no ambiguity with something that
is as human readable as possible, considering that many terminals
have problems with UTF-8.  However, I think this is not an issue,
since the decision on what to escape can be delayed until
everything works.  (I hope this answers Stig's notes.)
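To make the minimal-escaping proposal quoted above concrete, here is a
rough sketch of how a single attribute value could be escaped under those
rules.  It is only an illustration: pretty_escape() and needs_escape() are
hypothetical names, not functions in slapd or libldap, and the sketch
assumes the RFC 2253 set of characters that require escaping.  Backslash
and leading/trailing spaces are emitted as hexpairs; everything else,
including UTF-8 bytes, passes through untouched.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: does the byte at position i need escaping
 * under RFC 2253 (special characters, leading '#', leading or
 * trailing space)? */
static int needs_escape(const char *val, size_t len, size_t i)
{
    char c = val[i];

    if (c != '\0' && strchr(",+\"\\<>;", c) != NULL)
        return 1;
    if (c == '#' && i == 0)
        return 1;
    if (c == ' ' && (i == 0 || i == len - 1))
        return 1;
    return 0;
}

/* Hypothetical helper: minimally escape val[0..len) into out.
 * Backslash and escaped spaces become hexpairs ("\5C", "\20");
 * other special characters get a single backslash prefix.
 * Returns the number of bytes written (excluding the NUL),
 * or -1 if out is too small. */
static int pretty_escape(const char *val, size_t len, char *out, size_t outsz)
{
    size_t o = 0;

    if (outsz == 0)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)val[i];

        if (needs_escape(val, len, i)) {
            if (c == '\\' || c == ' ') {
                if (o + 3 >= outsz)
                    return -1;
                o += (size_t)sprintf(out + o, "\\%02X", c);
            } else {
                if (o + 2 >= outsz)
                    return -1;
                out[o++] = '\\';
                out[o++] = (char)c;
            }
        } else {
            if (o + 1 >= outsz)
                return -1;
            out[o++] = (char)c;
        }
    }
    out[o] = '\0';
    return (int)o;
}
```

With this, a value with a leading space, a backslash, a comma and a
trailing space comes out with only those four positions escaped, and
no literal "\\" ever appears in the output, which is what makes later
manipulation (e.g. splitting on unescaped commas) simpler.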

> 
> And, yes, DN (in any form) and other things containing UTF-8
> may be logged.
> 
> 
> >When is pretty to be used? I thought pretty was for output to
> >humans (and logs), of course it might make sense then as well.
> >My thinking is that we should have a normalization form that is
> >used internally, and easy parsing would make sense there as well.
> >Unless we can parse it only once, and pass the parsed result
> >around.

That's what I'd like to do, with a structure like:

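typedef struct slap_dn {
	struct berval	dn_raw;		/* string as provided by the client */
	struct berval	dn_pretty;	/* pretty string form */
	struct berval	dn_normalized;	/* normalized string form */
	LDAPDN		*dn_PRETTY;	/* parsed pretty form */
	LDAPDN		*dn_NORMALIZED;	/* parsed normalized form */
} SlapDN_t;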

with a set of functions that perform each operation by filling in the
required part of the structure, starting from the most appropriate
member available.  For example, dn_normalized can be built from dn_raw
by parsing it, or from dn_PRETTY by copying it to dn_NORMALIZED and
normalizing the latter.  Then we can pass this structure around instead
of two or more bervals as we do now.
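
As a rough illustration of the "fill on demand" idea, the sketch below
caches a normalized string and builds it from the best member that is
already present.  Everything in it is an assumption made for the example:
struct sketch_berval is a local stand-in for the real struct berval,
SketchDN keeps only the three string members, the parsed LDAPDN members
are left out, and "normalization" is reduced to ASCII case folding, which
real DN normalization is not.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for struct berval so the sketch compiles alone. */
struct sketch_berval {
    size_t  bv_len;
    char   *bv_val;
};

/* Reduced, hypothetical version of the proposed structure:
 * string forms only, no parsed members. */
typedef struct sketch_dn {
    struct sketch_berval dn_raw;
    struct sketch_berval dn_pretty;
    struct sketch_berval dn_normalized;
} SketchDN;

/* Toy normalization: copy src into dst, folding ASCII to lower case.
 * Returns 0 on success, -1 on allocation failure. */
static int bv_fold_copy(const struct sketch_berval *src,
                        struct sketch_berval *dst)
{
    dst->bv_val = malloc(src->bv_len + 1);
    if (dst->bv_val == NULL)
        return -1;
    for (size_t i = 0; i < src->bv_len; i++)
        dst->bv_val[i] = (char)tolower((unsigned char)src->bv_val[i]);
    dst->bv_val[src->bv_len] = '\0';
    dst->bv_len = src->bv_len;
    return 0;
}

/* Return the normalized form, building and caching it on first use.
 * Prefers dn_pretty as the source and falls back to dn_raw. */
static const struct sketch_berval *sketch_dn_normalized(SketchDN *dn)
{
    const struct sketch_berval *src;

    if (dn->dn_normalized.bv_val != NULL)
        return &dn->dn_normalized;          /* already filled */

    src = dn->dn_pretty.bv_val != NULL ? &dn->dn_pretty : &dn->dn_raw;
    if (src->bv_val == NULL || bv_fold_copy(src, &dn->dn_normalized) != 0)
        return NULL;
    return &dn->dn_normalized;
}
```

A caller would zero-initialize the structure, set dn_raw, and then ask
for whichever form it needs; a second call returns the cached member
without redoing the work.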

> >I think the normalized form (used internally), should be
> >UTF-8 (no hex for non-ascii). I see no reason to hex escape non-
> >ascii, only extra work. 

I'm open to this point.

> >For output to logs (pretty form?), we
> >should hex escape some non-ascii characters, maybe all. Could
> >you Ando, or someone else?, tell me why you want to do hex-
> >escaping of non-ascii internally? I'm not in doubt what I want,
> >but I could be missing something... (: I really should read that
> >piece of code, but I think I'll do some other stuff and wait
> >till it settles a bit. Looks like three of you are working a lot
> >on this right now.

There's nothing really to read; we can easily turn escaping on or off.
I currently have it turned off for prettying, and I can easily turn it
off everywhere else as well.  I think we mostly need to focus on
deciding when to use the API :)

Ando.