Re: LDAPprep


Thanks for raising these issues.  I note the I-D in question
is in the RFC-Editor queue.  I have notified our AD that LDAPBIS
is discussing these issues and will likely propose a change
be made prior to publication.  I'll work with the AD to
determine how best to accomplish this.

As indicated by my above comments, I concur that there is
at least one issue significant enough to warrant a
last-minute fix.  Details below:

At 11:15 AM 5/4/2006, David Wilson wrote:
>I've been looking at draft-ietf-ldapbis-strprep-07, and there seems to
>be a serious problem in the area of substring matches.
>Section 2.6.1 states that if the string contains any non-space
>characters then it is modified to start and finish with a space, and any
>internal sequences of spaces are altered to be two spaces. This appears
>to apply to substring filter strings. (The following paragraph has a
>specific exception for these for the case of only spaces in the value).
>But if this is done, and you have, say,
>        (cn=*bar*)
>that is not going to match a value of "foobar", as the 'any' string
>becomes "<SPACE>bar<SPACE>" by the above rule, the value being matched
>becomes "<SPACE>foobar<SPACE>" which does not contain the substring.
>The overall scheme would work, but you need more complicated rules for
>substring filter strings. Inner sequences of spaces become two spaces.
>Leading or trailing sequences become one space, but spaces are NOT added
>at the ends except:
>- a space at the start of an initial substring
>- a space at the end of a final substring

I concur.
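
To make the failure concrete, here is a minimal Python sketch of the
-07 space handling as David describes it, applied the same way to the
stored value and to the 'any' substring.  The name prep_07 and the
substring flag are made up for illustration; only U+0020 is treated as
a space, and the other string preparation steps are ignored.

```python
import re

def prep_07(s: str, substring: bool = False) -> str:
    """Space handling as described for strprep-07 (hypothetical helper)."""
    core = s.strip(" ")
    if core == "":
        # No non-space characters: one SPACE for substrings, else two.
        return " " if substring else "  "
    # Inner runs of spaces become exactly two spaces; the result is then
    # wrapped in exactly one leading and one trailing space.
    return " " + re.sub(r" +", "  ", core) + " "

value = prep_07("foobar")                  # " foobar "
any_sub = prep_07("bar", substring=True)   # " bar "
print(any_sub in value)                    # False: (cn=*bar*) misses "foobar"
```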

The fix would be to replace:
 If the input string contains at least one non-space character, then
 the string is modified such that the string starts with exactly one
 space character, ends with exactly one SPACE character, and that
 any inner (non-empty) sequence of space characters is replaced with
 exactly two SPACE characters.  For instance, the input strings
 "foo<SPACE>bar<SPACE><SPACE>", results in the output
 "<SPACE>foo<SPACE><SPACE>bar<SPACE>".

 Otherwise, if the string being prepared is an initial, any, or final
 substring, then the output string is exactly one SPACE character,
 else the output string is exactly two SPACEs.

with:

 For input strings which are attribute values or non-substring
 assertion values:  If the input string contains no non-space
 character, then the output is exactly two SPACEs.   Otherwise
 (the input string contains at least one non-space character)
 then the string is modified such that the string starts
 with exactly one space character, ends with exactly one SPACE
 character, and that any inner (non-empty) sequence of space
 characters is replaced with exactly two SPACE characters.  For
 instance, the input string "foo<SPACE>bar<SPACE><SPACE>"
 results in the output "<SPACE>foo<SPACE><SPACE>bar<SPACE>".

 For input strings which are substring assertion values: If the
 string being prepared contains no non-space characters, then the
 output string is exactly one SPACE.  Otherwise, the following steps
 are taken:
  - Any inner (non-empty) sequence of space characters is replaced
    with exactly two SPACE characters;
  - If the input string is an initial substring, it is modified to
    start with exactly one SPACE character;
  - If the input string is an initial or an any substring which ends in
    one or more space characters, it is modified to end with exactly
    one SPACE character;
  - If the input string is an any or a final substring which starts
    with one or more space characters, it is modified to start with
    exactly one SPACE character; and
  - If the input string is a final substring, it is modified to end
    with exactly one SPACE character.
 For instance, for the input string "foo<SPACE>bar<SPACE><SPACE>"
 as an initial substring, the output would be
 "<SPACE>foo<SPACE><SPACE>bar<SPACE>".  As an any substring, the
 same input would result in "foo<SPACE><SPACE>bar<SPACE>".
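
As a rough illustration only, the proposed handling could be sketched
in Python as below.  The function name prep_spaces and the kind labels
are hypothetical.  The sketch assumes that inner runs of spaces are
doubled for every kind of string (as in the initial-substring example),
that any and final substrings keep a single leading space only when the
input had leading spaces, and that only U+0020 counts as a space.

```python
import re

def prep_spaces(s: str, kind: str = "value") -> str:
    """kind is "value" (attribute or non-substring assertion value),
    "initial", "any", or "final" (hypothetical labels)."""
    if kind not in ("value", "initial", "any", "final"):
        raise ValueError(kind)
    core = s.strip(" ")
    if core == "":
        # No non-space characters: two SPACEs for values, one for substrings.
        return "  " if kind == "value" else " "
    core = re.sub(r" +", "  ", core)  # inner runs -> exactly two spaces
    if kind == "value":
        return " " + core + " "
    # Substrings: a space is added only at the start of an initial and at
    # the end of a final; existing leading/trailing runs shrink to one.
    start = " " if kind == "initial" or s.startswith(" ") else ""
    end = " " if kind == "final" or s.endswith(" ") else ""
    return start + core + end

value = prep_spaces("foobar")                               # " foobar "
print(prep_spaces("bar", "any") in value)                   # True
print(prep_spaces("foo bar  ", "initial") == " foo  bar ")  # True
```

With this handling (cn=*bar*) matches "foobar" again, while the
initial substring of (cn=foo\20*) is prepared as
"<SPACE>foo<SPACE>" and so does not match it.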

>I have two other minor comments on this draft, not directly related to
>the above.
>draft-ietf-ldapbis-syntaxes-11 does not change the definition of the
>telephone number syntax nor the definition of facsimile telephone
>number. In both cases the number is a PrintableString. So, I'm not sure
>why "2.6.3 telephoneNumber Insignificant Character Handling" needs to
>make allowance for non-PrintableString hyphen-type characters.

I note there is a third case: values carried in a Substring Assertion.
That aside...

We certainly could have regarded only U+002D as a hyphen here.
We didn't.  That is, I suggest we not regard this as a significant
technical issue (one requiring we consider changes to an approved
specification).

>In Appendix B, an alternative scheme for insignificant space handling is
>described. In conjunction with substring matching, this alternative
>scheme tends to make substrings in the filter shorter, by removing
>leading and trailing spaces. Therefore you get matches which you don't
>expect, rather than not getting matches you do expect. In particular,
>the first case erroneously states that (with this mechanism)
>(cn=foo\20*\20bar) would NOT match CN values "foo<SPACE>bar" and
>"foo<SPACE><SPACE>bar", but it would. The initial and final substrings
>are reduced to "foo" and "bar", and so match the prepared values, which
>both become "foo<SPACE>bar". The same applies to the third example,
>which uses the same filter and one of the same value strings.

Here I think that I was treating the space in (cn=foo<SPACE>*) (in
at least some instances) as an inner space.  Aside from dropping
the "would not" part of item 1, I need to rephrase item 3 as
indicative of why another simple approach (just leaving these
single spaces in) is also problematic.  Luckily this is in
non-normative background text.
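
For reference, a small Python sketch of the alternative scheme as David
summarizes it (leading and trailing spaces removed, inner runs assumed
to collapse to one space) shows the behaviour he describes.  The helper
names are hypothetical, and strings consisting only of spaces are not
modelled.

```python
import re

def prep_simple(s: str) -> str:
    """Alternative handling: strip the ends, collapse inner runs to one space."""
    return re.sub(r" +", " ", s.strip(" "))

def matches(initial: str, final: str, value: str) -> bool:
    """Evaluate (attr=initial*final) after preparing all three strings."""
    v, i, f = prep_simple(value), prep_simple(initial), prep_simple(final)
    # The initial and final substrings must not overlap within the value.
    return len(v) >= len(i) + len(f) and v.startswith(i) and v.endswith(f)

# (cn=foo\20*\20bar): the initial is "foo<SPACE>", the final "<SPACE>bar".
for cn in ("foo bar", "foo  bar", "foobar"):
    print(repr(cn), matches("foo ", " bar", cn))
```

All three values match under this sketch, including "foobar", which
is a match one would not expect from that filter.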

Thanks, Kurt