Re: dnValidate (was: Re: UTF8 case insensitive matching)

To: "David A. Cooper" <david.cooper@nist.gov>
Subject: Re: dnValidate (was: Re: UTF8 case insensitive matching)
From: "Kurt D. Zeilenga" <Kurt@OpenLDAP.org>
Date: Fri, 22 Dec 2000 17:15:51 -0800
Cc: openldap-devel@OpenLDAP.org
In-reply-to: <4.2.2.20001222111658.00acdad0@email.nist.gov>
References: <5.0.0.25.0.20001031104352.0272cad0@router.boolean.net> <4.2.2.20001031125512.00a63870@email.nist.gov>

Just a few random comments:

At 01:27 PM 12/22/00 -0500, David A. Cooper wrote:
>Based on earlier discussions, I have been working on a version of dnValidate that will read in a distinguished name that was generated based on RFC 2253 (or RFC 1779) and will return a normalized version of that string (compliant with RFC 2253).

This is, in general, a good thing.

In fact, I would prefer that such a routine be provided within the
client library, as clients need to normalize user input.  That is,
often users type in DNs which don't strictly conform to RFC2253,
Section 2 (but use some RFC 1779 style variants).

>The code that I produced will handle attribute values of type Directory string whether they are provided as a string, a quoted string, or a BER encoded string.

When dealing with BER encoded directoryStrings, I would suggest
limiting yourself to universalString (UCS-4), printableString
(subset of IA5), and utf8String (UTF-8) choices of the
directoryString syntax.  That is, I would not waste time dealing
with teletexString (T.61) strings.

In addition, I suggest you do not muck with any attribute
value of an attribute type not listed in the RFC 2253,
Section 2.3 table.  This means you only have to deal with
the directoryString and IA5String BER encodings.
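
A minimal sketch of that restriction, in Python for brevity rather
than slapd's C.  It assumes the value arrives in the RFC 2253
'#' + hexpairs form and handles only short-form BER lengths.  The
name decode_ber_string is hypothetical.  The tag numbers are the
universal ASN.1 tags for these string types.

```python
# Universal BER tags for the string types this message suggests handling.
ACCEPTED = {
    0x0C: "utf-8",      # UTF8String
    0x13: "ascii",      # PrintableString (printable subset not checked here)
    0x16: "ascii",      # IA5String
    0x1C: "utf-32-be",  # UniversalString (UCS-4)
}
# TeletexString (tag 0x14, T.61) is deliberately absent from the table.


def decode_ber_string(hexvalue):
    """Decode a '#<hexpairs>' value into text, or raise ValueError."""
    if not hexvalue.startswith("#"):
        raise ValueError("not a BER-encoded value")
    data = bytes.fromhex(hexvalue[1:])
    if len(data) < 2:
        raise ValueError("truncated BER value")
    tag, length = data[0], data[1]
    if tag not in ACCEPTED:
        raise ValueError("unsupported string type 0x%02X" % tag)
    if length & 0x80:
        # Long-form lengths are left out to keep the sketch short.
        raise ValueError("long-form length not handled in this sketch")
    content = data[2:]
    if len(content) != length:
        raise ValueError("length does not match content")
    return content.decode(ACCEPTED[tag])
```

Anything outside the table, T.61 included, is rejected rather than
converted, so the caller can leave such values alone.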

>It can similarly handle attribute values of type bitstring.

As there is no type listed in the table which has this syntax,
the type *should* be listed by OID and the value BER encoded.
However, as noted above, don't muck with these.

>Unlike the dnValidate function currently in servers/slapd/schema_init.c, I have added two additional parameters: make_uppercase and compress_whitespace.

For the client side routine, the user provided values should not
be mucked with.

On the server side routine, we have to be careful with when and
where we muck as in general a directory server should not muck
with user data.  That is, if a user provides a goofy looking
DN as a value of say a 'member' attribute type, the server should
provide the value back to the user when later requested.

However, a DN is a complex attribute syntax.  It is valid for
a server to convert the DN string on input to BER form and then
produce a string generated from the BER form upon request.  This
conversion can be done upfront or as needed.

So, it would be okay for the server to "normalize" the DN string
representation as long as it does not alter the user data contained
in the representation.  That is, the server can "pretty" the DN
(replace RFC 1779isms with RFC 2253isms, including value escaping),
but it cannot alter any assertion value.  No removal of leading, trailing,
or consecutive spaces, no upper (or lower) casing, etc.  Such "normalization"
should only be done during DN matching.

That is:
        cn = " foo "; o=bar

can be "prettied" to:
        CN=\20foo\20,o=bar

as such a change preserves the assertion values.
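
A rough sketch of such a "pretty" step, in Python for brevity rather
than slapd's C.  The names (dn_pretty and its helpers) are hypothetical,
and the sketch makes several simplifying assumptions: single-valued
RDNs only (no '+'), no '#'-prefixed BER values, no "OID." prefixes,
and hexpair escapes limited to ASCII.  It leaves the case of the
attribute type as typed, and always emits the hexpair escape form.

```python
SPECIAL = ',+"\\<>;'
HEX = "0123456789abcdefABCDEF"


def split_rdns(dn):
    """Split a DN string on ',' or ';' that is neither quoted nor escaped."""
    parts, cur, in_quotes, i = [], [], False, 0
    while i < len(dn):
        c = dn[i]
        if c == "\\" and i + 1 < len(dn):
            cur.append(dn[i:i + 2])      # keep the escape pair together
            i += 2
            continue
        if c == '"':
            in_quotes = not in_quotes
        if c in ",;" and not in_quotes:
            parts.append("".join(cur))
            cur = []
        else:
            cur.append(c)
        i += 1
    parts.append("".join(cur))
    return parts


def parse_value(raw):
    """Recover the literal value; quoted or escaped spaces are kept."""
    chars, in_quotes, i = [], False, 0   # (character, protected) pairs
    while i < len(raw):
        c = raw[i]
        if c == '"':
            in_quotes = not in_quotes
            i += 1
        elif c == "\\" and i + 1 < len(raw):
            pair = raw[i + 1:i + 3]
            if len(pair) == 2 and all(h in HEX for h in pair):
                chars.append((chr(int(pair, 16)), True))   # \20 form
                i += 3
            else:
                chars.append((raw[i + 1], True))           # \, form
                i += 2
        else:
            chars.append((c, in_quotes))
            i += 1
    # Only unprotected spaces at the ends are formatting, not data.
    while chars and chars[0] == (" ", False):
        chars.pop(0)
    while chars and chars[-1] == (" ", False):
        chars.pop()
    return "".join(ch for ch, _ in chars)


def escape_value(value):
    """Escape per RFC 2253, Section 2.4, using hexpairs throughout."""
    last = len(value) - 1
    out = []
    for i, ch in enumerate(value):
        needs = (ch in SPECIAL
                 or (i == 0 and ch in " #")
                 or (i == last and ch == " "))
        out.append("\\%02X" % ord(ch) if needs else ch)
    return "".join(out)


def dn_pretty(dn):
    """Rebuild a DN string in RFC 2253 form without touching the values."""
    rdns = []
    for part in split_rdns(dn):
        attr_type, sep, raw = part.partition("=")
        if not sep or not attr_type.strip():
            raise ValueError("malformed RDN: %r" % part)
        rdns.append(attr_type.strip() + "=" + escape_value(parse_value(raw)))
    return ",".join(rdns)
```

With the input above, dn_pretty('cn = " foo "; o=bar') gives
cn=\20foo\20,o=bar: the quoting and separator change, the value
" foo " does not.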

>If make_uppercase is set, then all of the characters in the string are made uppercase (using uctoupper), otherwise the cases of the characters in the attribute values are left unchanged.
>
>If compress_whitespace is set, then all leading and trailing whitespace characters are removed from attribute values and sequences of whitespace characters between "words" in an attribute value are replaced by a single space. If compress_whitespace is not set, then only those leading and trailing whitespace characters that are not considered to be part of the attribute value according to RFC 2253 are removed. For example, if compress_whitespace is set, the string 'cn =  \20 David  Cooper \20  ' would be compressed to 'cn=David Cooper', whereas it would become 'cn=\20 David  Cooper \20' if compress_whitespace were not set.

Such a change alters the assertion value and should not be done
as part of prettying the DN (but is done for matching).
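
A small sketch of keeping the two steps apart, in Python for brevity.
The function names are hypothetical, the flag names are borrowed from
the proposal, and str.upper() stands in for uctoupper.  The functions
take the literal (already unescaped) attribute value, and the
normalized copy is used only for comparison.

```python
def normalize_for_match(value, make_uppercase=True, compress_whitespace=True):
    """Return a throwaway copy of the value suitable for comparison."""
    if compress_whitespace:
        # Drop leading/trailing whitespace and collapse inner runs.
        value = " ".join(value.split())
    if make_uppercase:
        value = value.upper()
    return value


def values_match(stored, asserted):
    """Compare two values without modifying either one."""
    return normalize_for_match(stored) == normalize_for_match(asserted)
```

The stored value is never overwritten, so a later read returns
exactly what the user supplied.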

>At the moment, I have not attempted to merge my version of dnValidate into the slapd code, but I have written the code in a way that should make this relatively straightforward. In the meantime, I have written a very short program that reads in a DN from the command line and prints out its normalized form.
>
>If you would like to see what I have done, I have posted the code on our Web site as a compressed tar file (about 10 Kbytes) at:
>
>                 http://csrc.nist.gov/pki/testing/dnValidate_test.tgz
>
>For my own purposes, for the short term, my plan is to re-write dn_validate and dn_normalize as functions that call my version of dnValidate and then overwrite the original string with the string returned by dnValidate (if the normalized string will fit).
>
>If anyone has any ideas on a cleaner long-term solution to integrating this function (as Kurt suggested below) or any other comments on this code, please let me know.
>
>Thanks,
>David Cooper
>
>At 11:38 AM 10/31/00 -0800, Kurt D. Zeilenga wrote:
>>There are a number of additional DN issues which need to be addressed.
>>LDAPv2 and LDAPv3 have different DN encoding requirements.  Though the
>>LDAPv3 DN form (RFC 2253) can be viewed as a subset of the LDAPv2
>>specification (RFC 1779) form, there are other requirements
>>(e.g.: LDAPv2 restricts an LDAPDN to IA5, LDAPv3 restricts to UTF-8).
>>
>>When talking LDAPv2, a server must accept and produce RFC 1779 DNs.
>>When talking LDAPv3, a server must accept and produce the restricted
>>  DN form defined in RFC 2253.
>>
>>I suggest we allow the more liberal RFC1779 in both LDAPv2 and LDAPv3
>>but only store (and produce) DNs in the RFC 2253 restricted form for
>>both LDAPv2 and LDAPv3.  I also suggest we ignore the LDAPv2 IA5
>>restriction.
>>
>>This implies that we need not only validation and normalization functions,
>>but a DN "pretty" function.  To "pretty" the DN, the DN would be parsed
>>per RFC 1779 and then rebuilt per RFC 2253, Section 2.  We'd avoid
>>unnecessary escaping, use the hexpair escaping form versus the escape
>>prefix form, avoid OIDs, avoid BER encoded values, etc.  To normalize,
>>we'd parse per RFC 1779, normalize the value per its syntax, then
>>rebuild per RFC 2253.
>>
>> >Ideally, I would like to fix the dn_validate() function so that all three of these strings normalize to the same result. While I don't think that I'll be able to fix things so that any arbitrary DN can be normalized, I would like to get as close as possible.
>>
>>Note that we don't have to implement all possible DN forms.  A number of the
>>forms we may disallow.  In particular, we may be quite selective of what values
>>we accept in BER form.
>>
>> >One problem that I have, though, is that since DNs must currently be normalized in place,
>>
>>The long term approach is to replace dn_validate/dn_normalize with
>>dnValidate/dnNormalize/dnPretty.  This resolves the in place issue.
>>
>> >Similarly, if I always normalized to a quoted representation,
>>
>>Note that we must not produce the quoted representation in LDAPv3.  It's only
>>allowed in LDAPv2.
>>
>> >If an alternative solution (i.e., one that allows normalization to increase the length of a DN) will be available, then I will abandon my current approach and wait until a cleaner solution can be implemented.
>>
>>We need to shift to using dnValidate/dnNormalize/dnPretty....
>>This is significant work.
