
Re: dnValidate (was: Re: UTF8 case insensitive matching)

To: Stig Venås <venaas@alfa.itea.ntnu.no>
Subject: Re: dnValidate (was: Re: UTF8 case insensitive matching)
From: "David A. Cooper" <david.cooper@nist.gov>
Date: Fri, 22 Dec 2000 15:18:04 -0500
Cc: openldap-devel@OpenLDAP.org
In-reply-to: <20001222205337.A790@itea.ntnu.no>
References: <4.2.2.20001222111658.00acdad0@email.nist.gov> <4.2.2.20001031125512.00a63870@email.nist.gov> <5.0.0.25.0.20001031104352.0272cad0@router.boolean.net> <4.2.2.20001222111658.00acdad0@email.nist.gov>

At 08:53 PM 12/22/00 +0100, Stig Venås wrote:
>On Fri, Dec 22, 2000 at 01:27:55PM -0500, David A. Cooper wrote:
> > 
> > Based on earlier discussions, I have been working on a version of dnValidate that will read in a distinguished name that was generated based on RFC 2253 (or RFC 1779) and will return a normalized version of that string (compliant with RFC 2253).
>
>I started to look into this as well, but have been working mostly on Unicode
>normalization. I probably have some code ready for use in 2-3 weeks. When
>normalizing the dn, we should also do Unicode normalization I think.

I'm not sure what you mean here. According to RFC 2253, the string representation of an attribute value must be a UTF-8 string. So the code that I wrote reads in characters one at a time, converts each one to a Unicode code point as needed to call uctoupper, and then converts the result back to UTF-8 to place in the normalized string. Did you have something else in mind?
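
To make the decode, case-map, re-encode loop described above concrete, here is a minimal sketch in C. It is not the dnValidate code itself: utf8_decode, utf8_encode, utf8_upper and sketch_toupper are hypothetical names, and sketch_toupper is only a stand-in for uctoupper that covers ASCII and Latin-1, so that the example compiles without the ucdata library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for uctoupper: handles ASCII and Latin-1 only. */
static unsigned long sketch_toupper(unsigned long c)
{
    if (c >= 'a' && c <= 'z') return c - 0x20;
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7) return c - 0x20;
    return c;
}

/* Decode one UTF-8 sequence from s (n bytes available).
 * Returns the number of bytes consumed, or 0 if the input is malformed. */
static size_t utf8_decode(const unsigned char *s, size_t n, unsigned long *out)
{
    size_t len, i;
    unsigned long c;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *out = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else return 0;
    if (len > n) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;      /* not a continuation byte */
        c = (c << 6) | (unsigned long)(s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values beyond U+10FFFF. */
    if ((len == 2 && c < 0x80) || (len == 3 && c < 0x800) ||
        (len == 4 && c < 0x10000)) return 0;
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
    *out = c;
    return len;
}

/* Encode code point c (assumed valid) into out; returns bytes written (1-4). */
static size_t utf8_encode(unsigned long c, unsigned char *out)
{
    if (c < 0x80) { out[0] = (unsigned char)c; return 1; }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (c >> 18));
    out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
}

/* Uppercase the UTF-8 string in into out (capacity outsz, including NUL).
 * Returns the length written, or (size_t)-1 on malformed input or overflow. */
static size_t utf8_upper(const char *in, char *out, size_t outsz)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = strlen(in), i = 0, o = 0;

    assert(out != NULL);
    while (i < n) {
        unsigned long c;
        unsigned char buf[4];
        size_t used = utf8_decode(s + i, n - i, &c);
        size_t w;

        if (used == 0) return (size_t)-1;
        w = utf8_encode(sketch_toupper(c), buf);
        if (o + w + 1 > outsz) return (size_t)-1;  /* keep room for the NUL */
        memcpy(out + o, buf, w);
        o += w;
        i += used;
    }
    if (o + 1 > outsz) return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

Note that the output length is tracked separately from the input length, since a full Unicode case mapping can change the number of UTF-8 bytes a character needs.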

> > The code that I produced will handle attribute values of type Directory string whether they are provided as a string, a quoted string, or a BER encoded string. It can similarly handle attribute values of type bitstring.
> > 
> > Unlike the dnValidate function currently in servers/slapd/schema_init.c, I have added two additional parameters: make_uppercase and compress_whitespace.
> > 
> > If make_uppercase is set, then all of the characters in the string are made uppercase (using uctoupper), otherwise the cases of the characters in the attribute values are left unchanged.
>
>I know that uppercasing is used now, but I'm wondering if it would be
>better to do lowercasing. In most cases there are mostly lower case
>characters to begin with (I think).

I don't think there would be a problem going either way. One could just replace TOUPPER and uctoupper with TOLOWER and uctolower.
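
As a rough illustration of how small that change is, the sketch below takes the folding direction as a function pointer, so switching from uppercase to lowercase is a one-argument change. fold_ascii and case_fn are hypothetical names, and the standard toupper/tolower stand in for the TOUPPER/TOLOWER macros. Only ASCII bytes are touched here; non-ASCII characters would go through the uctoupper/uctolower path instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical type: any int -> int case-mapping function. */
typedef int (*case_fn)(int);

/* Fold the ASCII characters of s in place using fold.
 * Bytes >= 0x80 (parts of multi-byte UTF-8 sequences) are left unchanged. */
static void fold_ascii(char *s, case_fn fold)
{
    assert(s != NULL && fold != NULL);
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (c < 0x80)
            *s = (char)fold(c);
    }
}
```

Calling fold_ascii(dn, toupper) gives the current behaviour, and fold_ascii(dn, tolower) gives the lowercase variant.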

> > For my own purposes, for the short term, my plan is to re-write dn_validate and dn_normalize as functions that call my version of dnValidate and then overwrite the original string with the string returned by dnValidate (if the normalized string will fit).
>
>Do you plan to look at the case where it won't fit later?

The nice thing about simply overwriting the original string in dn_validate and dn_normalize is that it involves only a local change to the code. To do things properly, so that a normalized string longer than the original can also be returned, all of the calls to these functions would need to be changed. While that may or may not be difficult, I am not very familiar with the code and would be concerned about introducing bugs if I tried to do it myself. So I guess the short answer is that I was hoping that someone more familiar with the code would do it.
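
The overwrite-if-it-fits approach can be sketched as follows. This is an assumption about the shape of such a wrapper, not the actual dn_normalize code: overwrite_if_fits is a hypothetical helper, and the normalized string is passed in by the caller rather than produced by dnValidate.

```c
#include <assert.h>
#include <string.h>

/* Copy normalized over dn only if it is no longer than the current contents.
 * The wrapper cannot know how large the caller's buffer really is, so the
 * existing string length is the only capacity it can safely assume.
 * normalized must not overlap dn.
 * Returns 1 if dn was overwritten, 0 if it was left untouched. */
static int overwrite_if_fits(char *dn, const char *normalized)
{
    assert(dn != NULL && normalized != NULL);
    if (strlen(normalized) > strlen(dn))
        return 0;               /* would overflow: caller must handle this case */
    strcpy(dn, normalized);
    return 1;
}
```

The 0 return is the case raised above: handling it properly means changing the callers so that they can accept a newly allocated string.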

Dave
