Re: LDAPprep: mapping of " " values

I've been trying to think through this and some of the other examples
presented regarding spaces.

I suggest that LDAPprep:
- optionally(?) collapse adjacent spaces into a single space
- state that empty string in = empty string out

I assume we can sort out the combining character issues, BiDi, etc. I am
way out of my depth there.

I further suggest that the matching rules define the handling of leading
and trailing spaces in both the entry values and the assertion values.

Preparation for the existing string matching rules would then be something
like:

Entry values are prepared as follows:
1) apply LDAPprep, collapsing adjacent spaces
2) remove all leading and trailing spaces.  This can result in an empty
string.
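
A minimal Python sketch of the two entry-value steps above. The helper ldapprep_sketch() is only a stand-in I made up for illustration: it collapses adjacent spaces and does nothing else that the real LDAPprep would do (mapping, normalization, prohibited characters, BiDi).

```python
import re


def ldapprep_sketch(value):
    """Hypothetical stand-in for LDAPprep: only collapses runs of spaces."""
    return re.sub(r" {2,}", " ", value)


def prepare_entry_value(value):
    """Step 1: apply (stand-in) LDAPprep. Step 2: strip leading/trailing spaces."""
    return ldapprep_sketch(value).strip(" ")


if __name__ == "__main__":
    print(repr(prepare_entry_value("  foo   bar ")))  # 'foo bar'
    print(repr(prepare_entry_value("   ")))           # '' (empty string is allowed)
```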

Filter substrings are prepared as follows:

Note: It's a bit unclear at the protocol level whether (ou=foo) should
result in an initial, any, or final filter component.  I assume all three
are valid and basically equivalent.  [filter] implies it should be an any
filter -- substring = attr EQUALS [initial] any [final].

1) apply LDAPprep.
2) if this is the first substring in a filter, remove leading spaces.  This
can result in an empty string.
3) if this is the last substring in a filter, remove trailing spaces.  This
can result in an empty string.
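
As a sketch of those three steps in Python: I am assuming the assertion value can simply be split on "*" (ignoring escapes such as \2a), that the first piece is the initial substring and the last piece is the final substring, and that collapse() is an adequate stand-in for LDAPprep. All names here are hypothetical.

```python
import re


def collapse(value):
    """Hypothetical stand-in for LDAPprep: only collapses runs of spaces."""
    return re.sub(r" {2,}", " ", value)


def prepare_substrings(assertion):
    """Split on '*' and prepare each piece as described in steps 1-3."""
    parts = [collapse(p) for p in assertion.split("*")]  # step 1
    parts[0] = parts[0].lstrip(" ")                      # step 2: first substring
    parts[-1] = parts[-1].rstrip(" ")                    # step 3: last substring
    return parts


if __name__ == "__main__":
    print(prepare_substrings(" * "))        # ['', '']
    print(prepare_substrings(" * * "))      # ['', ' ', '']
    print(prepare_substrings("foo * bar"))  # ['foo ', ' bar']
```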


Using examples previously presented:
(ou= * ) matches everything (empty substring, *, empty substring)
(ou=* ) matches everything
(ou= *) matches everything

It's clear to me that not everyone will agree on this (a solution follows),
but the existing definitions of string matches imply that " foo " matches
"foo" and vice versa.  I don't see how to reconcile that with having " *"
match values with a leading space; having " foo*" match the value " foobar"
ought to imply that "foo*" should NOT match " foobar".

(l= * * ) matches values containing at least one space (empty substring, *,
" ", *, empty substring)

From the note below:
(l=foo * bar) does not match "foo bar" -- it matches "foo X bar" where X is
zero or more non-whitespace characters.  To match "foo bar" would require
preparing "foo " and " bar" differently depending on the contents of the
other substrings - and not just the next or previous substrings.  That
seems unwieldy.  The alternate filter (&(l=foo *bar)(l=foo* bar)) would
match as Kurt seemed to be suggesting "foo * bar" should act.
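
To check the examples above against the proposed preparation, here is a rough Python sketch of a substring matcher. It is an illustration under assumptions, not a reading of any spec: collapse() stands in for LDAPprep, the assertion is split naively on "*", case folding is left out, and every function name is invented.

```python
import re


def collapse(value):
    """Hypothetical stand-in for LDAPprep: only collapses runs of spaces."""
    return re.sub(r" {2,}", " ", value)


def prepare_entry(value):
    return collapse(value).strip(" ")


def prepare_filter(assertion):
    parts = [collapse(p) for p in assertion.split("*")]
    parts[0] = parts[0].lstrip(" ")    # first substring: drop leading spaces
    parts[-1] = parts[-1].rstrip(" ")  # last substring: drop trailing spaces
    return parts


def substrings_match(assertion, value):
    parts = prepare_filter(assertion)
    value = prepare_entry(value)
    if len(parts) == 1:                # no '*': plain equality
        return parts[0] == value
    initial, middle, final = parts[0], parts[1:-1], parts[-1]
    if not value.startswith(initial):
        return False
    pos = len(initial)
    for any_part in middle:            # each "any" must follow the previous one
        found = value.find(any_part, pos)
        if found < 0:
            return False
        pos = found + len(any_part)
    # the final substring must fit in what is left of the value
    return len(value) - pos >= len(final) and value.endswith(final)


if __name__ == "__main__":
    print(substrings_match(" * ", "anything"))         # True
    print(substrings_match(" * * ", "a b"))            # True
    print(substrings_match("foo * bar", "foo bar"))    # False
    print(substrings_match("foo * bar", "foo X bar"))  # True
    both = (substrings_match("foo *bar", "foo bar")
            and substrings_match("foo* bar", "foo bar"))
    print(both)                                        # True
```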

And, should we ever resolve the empty IA5 String discussion:
"" matches an empty string or values consisting solely of whitespace

If matching with leading and/or trailing spaces is desired, I suggest
defining new matching rules like caseIgnoreWithWhitespaceMatch.  That name
suggests all whitespace is significant.  Maybe LDAPprep should have a
parameter determining whether spaces should be collapsed at all -- or maybe
it should be left to the matching rules to do this.  But my intent is that
this hypothetical caseIgnoreWithWhitespaceMatch rule could behave like:

Entry values are prepared as follows:
1) apply LDAPprep

Filter substrings are prepared as follows:
1) apply LDAPprep
2) if this is the first substring in a filter, remove all but one leading
space
3) if this is the last substring in a filter, remove all but one trailing
space

Then:
- " foo " does not match "foo" or vice versa
- " *" matches only values with leading whitespace, etc.
- " * * " matches strings with a leading space, trailing space, and at
least one other embedded space surrounded by non-whitespace
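
A Python sketch of how this hypothetical rule might behave. I am assuming here that LDAPprep still collapses adjacent spaces (collapse() is my stand-in for it), which is what makes the " * * " case require non-whitespace between the spaces. Case folding and filter escapes are omitted, and all names are invented for illustration.

```python
import re


def collapse(value):
    """Hypothetical stand-in for LDAPprep: only collapses runs of spaces."""
    return re.sub(r" {2,}", " ", value)


def prepare_filter_ws(assertion):
    parts = [collapse(p) for p in assertion.split("*")]
    # Keep at most one leading space on the first substring and one trailing
    # space on the last. After collapse() these are no-ops; they matter only
    # if LDAPprep is changed so that it does not collapse spaces.
    parts[0] = re.sub(r"^ +", " ", parts[0])
    parts[-1] = re.sub(r" +$", " ", parts[-1])
    return parts


def with_whitespace_match(assertion, value):
    parts = prepare_filter_ws(assertion)
    value = collapse(value)            # entry values: LDAPprep only, no stripping
    if len(parts) == 1:
        return parts[0] == value
    initial, middle, final = parts[0], parts[1:-1], parts[-1]
    if not value.startswith(initial):
        return False
    pos = len(initial)
    for any_part in middle:
        found = value.find(any_part, pos)
        if found < 0:
            return False
        pos = found + len(any_part)
    return len(value) - pos >= len(final) and value.endswith(final)


if __name__ == "__main__":
    print(with_whitespace_match(" foo ", "foo"))    # False
    print(with_whitespace_match(" *", " foo"))      # True
    print(with_whitespace_match(" *", "foo"))       # False
    print(with_whitespace_match(" * * ", " a b "))  # True
    print(with_whitespace_match(" * * ", " ab "))   # False
```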


John  McMeeking


owner-ietf-ldapbis@OpenLDAP.org wrote on 11/17/2004 01:00:27 AM:

> At 09:11 PM 11/16/2004, Steven Legg wrote:
> >It is clear to me now that treating all strings and substrings exactly
> >the same way is the problem, no matter what that way is.
>
> I actually reached the same conclusion.  As it stands now,
> (l=foo * bar) will match "foobar".  That seems counter to
> the X.520/LDAP specification of these matching rules.
>
> >I am now arguing for LDAPprep and/or syntaxes to be revised so that
> >whitespace treatment is dependent on the context of the (sub)string.
> >Sometimes that means reducing a string of all spaces to an empty string,
> >and sometimes it doesn't.
>
> Problem is that this is not sufficient in the above case.
> (l=foo * bar) needs to match "foo bar" as well as "foo  bar"
> as well as "foo X bar".  We cannot simply apply LDAPprep to
> each substring and the value and match octet-wise.  Also,
> as Rici noted, we have to be careful that (l=x*)
> doesn't match "x'" where ' is a combining character.
>
> Seems what we need to do is remove the insignificant character
> removal step from LDAPprep (or have it only collapse
> adjacent spaces into one space) and then deal with
> space issues in matching rule specification.
>
> This implies a divergence from Stringprep which calls
> for the output to be compared code-point for code-point.
>
> Kurt
>