Re: LDAPprep: mapping of " " values

To: Steven Legg <steven.legg@eb2bcom.com>
Subject: Re: LDAPprep: mapping of " " values
From: "Kurt D. Zeilenga" <Kurt@OpenLDAP.org>
Date: Tue, 16 Nov 2004 02:35:57 -0800
Cc: ietf-ldapbis@OpenLDAP.org
In-reply-to: <4199A06D.3030302@eb2bcom.com>
References: <6.1.2.0.0.20041114173514.02e0cd80@127.0.0.1> <4199A06D.3030302@eb2bcom.com>

At 10:38 PM 11/15/2004, Steven Legg wrote:
>Kurt,
>Kurt D. Zeilenga wrote:
>>Steven suggested changing LDAPprep such that a string
>>consisting only of whitespace would be mapped to "" instead
>>of " ".  I believe this is a poor approach for a number of reasons.
>>While it can be argued (as Steven has) that such mapping
>>may make some assertions more intuitive, I argue that such
>>mapping will make various assertions less intuitive.
>>More importantly, the assertion (l=* *), which says "match a
>>significant space in values of l", would no longer behave
>>properly.  The " " ANY string would be mapped to "", leading
>>to (l=* *) matching any value instead of only those values
>>which contained a significant space.  This would likely
>>break a number of applications.
>
>I agree that matching everything in such a case is excessive.
>
>>It's my view that the assertion (l= *) says  "match a
>>significant leading space in values of l".  These assertions
>>intuitively should only match strings which are all whitespace,
>>as leading whitespace is otherwise insignificant.
>
>Wouldn't it be easier to just say (l= ) ?

Yes, but X.520 allows (l= *) instead.  What does your
implementation do today?  In OpenLDAP, this assertion
will only match values which are composed entirely
of whitespace.  Others?
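
The behavior described above can be illustrated with a minimal sketch.
It assumes a toy model of the space handling discussed in this thread
(leading and trailing spaces dropped, inner runs collapsed, an
all-whitespace string reduced to a single space).  The names prep and
matches_initial are hypothetical, and this is neither the LDAPprep
algorithm from the draft nor OpenLDAP's code.

```python
import re

def prep(s: str) -> str:
    """Toy stand-in for the space handling discussed in this thread."""
    # Assumption: an all-whitespace string is reduced to one space.
    if s and not s.strip(" "):
        return " "
    # Assumption: leading/trailing spaces are insignificant and inner
    # runs of spaces collapse to a single space.
    return re.sub(r" +", " ", s.strip(" "))

def matches_initial(value: str, initial: str) -> bool:
    """Evaluate (attr=<initial>*) against one value, after preparation."""
    return prep(value).startswith(prep(initial))

print(matches_initial("   ", " "))    # True: value is all whitespace
print(matches_initial("  foo", " "))  # False: leading spaces are insignificant
```

Under this model, (l= *) selects exactly the values that are composed
entirely of whitespace.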

The logic here is that, except in one special case,
leading and trailing spaces are insignificant.  One cannot
match on insignificant portions of the value without giving
them significance.  And giving leading and trailing spaces
significance changes the character of the rules in a major
way.

>> Likewise
>>for (l=* ).  Note that this behavior is actually useful.  One
>>can assert (!(l= *))
>
>or (!(l= ))
>
>> to match all values which are not
>>all whitespace.  Having (l= * * ) behave like (l=*)
>>subtracts value (and likely will break applications, see
>>above).
>
>In the current specifications (l= * * ) will never match anything!

I believe that this is correct.  As the old adage goes:
ask a stupid question, get a stupid answer.

>A value can only match (l= *) or (l=* ) if it is all whitespace.

I believe that this is correct.  As every string ("X") is equivalent
to some string which has insignificant leading and trailing
whitespace (" X "), these assertions would otherwise match the same
entries as (l=*).   The client should simply do (l=*) if that
is what it wants.

>If it is all whitespace then LDAPprep reduces it to a single space.
>A single space cannot simultaneously satisfy the initial, any and
>final substrings.

I believe that this is proper as there is only one significant
space and the assertion asked whether there are three significant
spaces.
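
A rough sketch of this reasoning follows.  It assumes the same toy
model of space handling as the thread describes, applied identically
to the value and to every substring, with the mapping of all-whitespace
strings made a parameter (blank) so the two alternatives can be
compared.  The names prep and substring_match are hypothetical; this
is not the draft's algorithm.

```python
import re

def prep(s, blank=" "):
    """Toy space handling; `blank` is what an all-whitespace string maps to."""
    if s and not s.strip(" "):
        return blank
    return re.sub(r" +", " ", s.strip(" "))

def substring_match(value, initial=None, any_parts=(), final=None, blank=" "):
    """Match initial, any and final substrings in order, without overlap."""
    v = prep(value, blank)
    pos = 0
    if initial is not None:
        i = prep(initial, blank)
        if not v.startswith(i):
            return False
        pos = len(i)
    for part in any_parts:
        p = prep(part, blank)
        idx = v.find(p, pos)
        if idx < 0:
            return False
        pos = idx + len(p)
    if final is not None:
        f = prep(final, blank)
        # The final substring must fit in what is left of the value.
        if len(v) - pos < len(f) or not v.endswith(f):
            return False
    return True

# (l= * * ) with all-whitespace mapped to " ": one space cannot serve three times.
print(substring_match("   ", " ", [" "], " "))               # False
# (l=* *) still distinguishes values with a significant space.
print(substring_match("foo bar", any_parts=[" "]))           # True
print(substring_match("foobar", any_parts=[" "]))            # False
# With all-whitespace mapped to "", (l=* *) matches everything.
print(substring_match("foobar", any_parts=[" "], blank=""))  # True
```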

>What we are running into I think is the problem that whitespace in
>different parts of an attribute value is treated differently, but the
>whitespace in each substring of a substring assertion is treated the
>same. Intuitively, one might expect that (l= * * ) should match a
>value like "  foo  bar  ".  It doesn't with the current specifications.
>It would if whitespace were reduced to nothing, but it would match everything else as well.

If intuitively one might expect this, then they might also
expect (l=* * *) to match "x  x" (or (l=*  *) to match "x  x"
but not "x x").  If one can match insignificant leading and
trailing spaces, then it intuitively follows one can match
insignificant consecutive spaces.

I believe that this is nonsense and that we should not redesign
matching to support matching of insignificant spaces.

>What we seem to need here is for leading whitespace in the initial substring
>and trailing whitespace in the final substring to be reduced to nothing,
>while every other sequence of whitespace characters, in the initial, any or
>final substring, reduces to a single space.

If there is a need to match insignificant spaces, a rule which
is specifically designed to support that matching should be used.
These rules were designed to ignore insignificant spaces.  We
should not change that.

>It would be a modest change to LDAPprep

What you ask for, IMO, is a change to the matching rules to support
matching of insignificant spaces in certain cases.  I believe
that such a change is inappropriate and certainly should be
viewed as a new feature.

> to enable something like this.
>We just need two parameters for each string handed to LDAPprep: a boolean
>flag that indicates whether whitespace in the initial part of the string
>is to be treated as leading whitespace, and a boolean flag that indicates
>whether whitespace in the final part of the string is to be treated as
>trailing whitespace. The syntaxes draft can then nominate values for the
>flags for each string or substring it passes to LDAPprep. Alternatively,
>LDAPprep can just reduce consecutive whitespace to a single space in every
>case and leave the syntaxes draft to nominate the circumstances under
>which a leading or trailing space is to be removed.
>
>>Additionally, I believe it important that all outputs of
>>LDAPprep be valid per the syntax of the input.
>>If this is not so, then implementations must be very
>>careful not to apply LDAPprep to the output of LDAPprep.
>>Also, LDAPprep could not be used as a canonicalization
>>function if we were to adopt this mapping.
>
>In the wider context of component matching (and potentially even within the
>framework of X.500) there are many ways that the output of LDAPprep could be
>invalid with respect to the syntax, i.e. ASN.1 type, of the abstract value that
>supplied the input string. It can change the length of the string such that it
>is no longer an acceptable length - too short, too long (?) or an
>explicitly disallowed length. It can introduce space characters where space
>characters are disallowed. It can create a sequence of characters that
>no longer satisfies a pattern constraint or value constraint. And so on.
>And what exactly is the output syntax of LDAPprep in ASN.1 terms ?
>A UTF8String ? A UniversalString ? That clearly doesn't line up with
>an input that is a TeletexString.
>
>LDAPprep is only used within the LDAP technical specification to prepare
>character strings for a comparison routine. It is an internal part
>of a function that accepts two values and produces TRUE, FALSE or Undefined
>as a result. If someone wants to use it for something else, like canonicalization,
>then they have to deal with the consequences, which are far more involved than
>dealing with empty strings.
>
>Regards,
>Steven
>
>>Kurt
>>  
>>
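
For reference, Steven's two-flag proposal quoted above might be
sketched roughly as follows.  The function name prep_flags and its
parameter names are hypothetical, and the sketch covers only the space
handling, under the assumption that every run of spaces is first
reduced to a single space and the flags then decide whether a leading
or trailing space is removed.

```python
import re

def prep_flags(s, strip_leading, strip_trailing):
    """Collapse space runs, then drop leading/trailing space per the flags."""
    out = re.sub(r" +", " ", s)
    if strip_leading:
        out = out.lstrip(" ")
    if strip_trailing:
        out = out.rstrip(" ")
    return out

# Preparing the parts of (l= * * ) and a stored value:
value   = prep_flags("  foo  bar  ", True, True)   # "foo bar"
initial = prep_flags(" ", True, False)             # ""
any_sub = prep_flags(" ", False, False)            # " "
final   = prep_flags(" ", False, True)             # ""
print(repr(value), repr(initial), repr(any_sub), repr(final))
```

With these flags the initial and final substrings become empty while
the any substring keeps one space, so the spaces in the initial and
final substrings are compared against whitespace that the rules
otherwise treat as insignificant.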
