Re: LDAPprep



I was a little too quick in pressing "Send".
Fixing a couple of typos, the replacement text would be:

  For input strings which are attribute values or non-substring
  assertion values:  If the input string contains no non-space    
  character, then the output is exactly two SPACEs.   Otherwise (the
  input string contains at least one non-space character) then the
  string is modified such that the string starts with exactly one space
  character, ends with exactly one SPACE character, and that any inner
  (non-empty) sequence of space characters is replaced with exactly two
  SPACE characters.  For instance, the input string
  "foo<SPACE>bar<SPACE><SPACE>" results in the output
  "<SPACE>foo<SPACE><SPACE>bar<SPACE>".
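
As a rough illustration only (not part of the proposed text), the rule
for attribute values and non-substring assertion values could be
sketched in Python as below.  The function name prep_value is made up
for this example, and the sketch assumes earlier preparation steps have
already mapped every space-like character to SPACE (U+0020).

```python
def prep_value(s: str) -> str:
    """Sketch of insignificant space handling for attribute values and
    non-substring assertion values (hypothetical helper name)."""
    # Assumption: only U+0020 counts as a space at this stage.
    words = [w for w in s.split(" ") if w]
    if not words:
        # No non-space character: output is exactly two SPACEs.
        return "  "
    # One leading SPACE, one trailing SPACE, and every inner run of
    # spaces replaced with exactly two SPACEs.
    return " " + "  ".join(words) + " "


print(repr(prep_value("foo bar  ")))
```

With the example input from the text, "foo<SPACE>bar<SPACE><SPACE>",
this yields "<SPACE>foo<SPACE><SPACE>bar<SPACE>".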

  For input strings which are substring assertion values: If the string
  being prepared contains no non-space characters, then the output
  string is exactly one SPACE.  Otherwise, the following steps are
  taken:
    - If the input string is an initial substring, it is modified to
      start with exactly one SPACE character;
    - If the input string is an initial or an any substring which ends
      in one or more space characters, it is modified to end with       
      exactly one SPACE character;
    - If the input string is an any or a final substring which starts in
      one or more space characters, it is modified to start with exactly
      one SPACE character; and
    - If the input string is a final substring, it is modified to end
      with exactly one SPACE character.
  For instance, for the input string "foo<SPACE>bar<SPACE><SPACE>" as an
  initial substring, the output would be
  "<SPACE>foo<SPACE><SPACE>bar<SPACE>".  As an any or final substring,
  the same input would result in "foo<SPACE>bar<SPACE>".
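
The four substring steps above could be sketched in Python roughly as
follows.  This is an illustration, not proposed text: the name
prep_substring and the kind argument are invented here, and only U+0020
is treated as a space.  The steps say nothing about inner runs of
spaces, so the sketch assumes they become two SPACEs, as in the
initial-substring example; under that assumption an any or final
substring also keeps two inner SPACEs, which differs from the any/final
example given in the text.

```python
def prep_substring(s: str, kind: str) -> str:
    """Sketch of insignificant space handling for one substring
    assertion component; kind is "initial", "any", or "final"
    (hypothetical helper, not from the draft)."""
    if kind not in ("initial", "any", "final"):
        raise ValueError("kind must be 'initial', 'any', or 'final'")
    words = [w for w in s.split(" ") if w]
    if not words:
        # No non-space characters: output is exactly one SPACE.
        return " "
    # Assumption: inner runs of spaces become exactly two SPACEs.
    core = "  ".join(words)
    # A leading SPACE is forced for an initial substring; otherwise it
    # is kept (as a single SPACE) only if the input began with a space.
    lead = kind == "initial" or s.startswith(" ")
    # A trailing SPACE is forced for a final substring; otherwise it
    # is kept (as a single SPACE) only if the input ended with a space.
    trail = kind == "final" or s.endswith(" ")
    return (" " if lead else "") + core + (" " if trail else "")


print(repr(prep_substring("foo bar  ", "initial")))
print(prep_substring("bar", "any") in " foobar ")
```

For the filter (cn=*bar*), the any substring "bar" is left as "bar",
which does occur in the prepared value "<SPACE>foobar<SPACE>", so the
match succeeds.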



At 12:28 PM 5/4/2006, Kurt D. Zeilenga wrote:
>David,
>
>Thanks for raising these issues.  I note the I-D in question
>is in the RFC-Editor queue.  I have notified our AD that LDAPBIS
>is discussing these issues and will likely propose a change
>be made prior to publication.  I'll work with the AD to
>determine how best to accomplish this.
>
>As indicated by my above comments, I concur that there is
>at least one issue significant enough to warrant a last
>minute fix.  Details below:
>
>At 11:15 AM 5/4/2006, David Wilson wrote:
>>I've been looking at draft-ietf-ldapbis-strprep-07, and there seems to
>>be a serious problem in the area of substring matches.
>>
>>Section 2.6.1 states that if the string contains any non-space
>>characters then it is modified to start and finish with a space, and any
>>internal sequences of spaces are altered to be two spaces. This appears
>>to apply to substring filter strings. (The following paragraph has a
>>specific exception for these for the case of only spaces in the value).
>>
>>But if this is done, and you have, say,
>>
>>        (cn=*bar*)
>>
>>that is not going to match a value of "foobar", as the 'any' string
>>becomes "<SPACE>bar<SPACE>" by the above rule, the value being matched
>>becomes "<SPACE>foobar<SPACE>" which does not contain the substring.
>>
>>The overall scheme would work, but you need more complicated rules for
>>substring filter strings. Inner sequences of spaces become two spaces.
>>Leading or trailing sequences become one space, but spaces are NOT added
>>at the ends except:
>>
>>- a space at the start of an initial substring
>>- a space at the end of a final substring
>
>I concur.
>
>The fix would be to replace:
> If the input string contains at least one non-space character, then
> the string is modified such that the string starts with exactly one
> space character, ends with exactly one SPACE character, and that
> any inner (non-empty) sequence of space characters is replaced with
> exactly two SPACE characters.  For instance, the input strings
> "foo<SPACE>bar<SPACE><SPACE>", results in the output
> "<SPACE>foo<SPACE><SPACE>bar<SPACE>".
>
> Otherwise, if the string being prepared is an initial, any, or final
> substring, then the output string is exactly one SPACE character,
> else the output string is exactly two SPACEs.
>
>with:
> For input strings which are attribute values or non-substring
> assertion values:  If the input string contains no non-space
> character, then the output is exactly two SPACEs.   Otherwise
> (the input string contains at least one non-space character)
> then the string is modified such that the string starts
> with exactly one space character, ends with exactly one SPACE
> character, and that any inner (non-empty) sequence of space
> characters is replaced with exactly two SPACE characters.  For
> instance, the input strings "foo<SPACE>bar<SPACE><SPACE>",
> results in the output "<SPACE>foo<SPACE><SPACE>bar<SPACE>".
>
> For input strings which are substring assertion values: If the
> string being prepared contains no non-space characters, then the
> output string is exactly one SPACE.  Otherwise, the following steps
> are taken:
>  - If the input string is an initial substring, it is modified to
>    start with exactly one SPACE character;
>  - If the input string is an initial or an any substring which ends in
>    one or more space characters, it is modified to end with exactly
>    one SPACE character;
>  - If the input string is an any or a final substring which ends in
>    one or more space characters, it is modified to end with exactly
>    one SPACE character; and
>  - If the input string is a final substring, it is modified to end
>    with exactly one SPACE character.
> For instance, for the input string "foo<SPACE>bar<SPACE><SPACE>"
> as an initial substring, the output would be
> "<SPACE>foo<SPACE><SPACE>bar<SPACE>".  As an any or final substring,
> the same input would result in "foo<SPACE>bar<SPACE>".
>
>
>>I have two other minor comments on this draft, not directly related to
>>the above.
>>
>>draft-ietf-ldapbis-syntaxes-11 does not change the definition of the
>>telephone number syntax nor the definition of facsimile telephone
>>number. In both cases the number is a PrintableString. So, I'm not sure
>>why "2.6.3 telephoneNumber Insignificant Character Handling" needs to
>>make allowance for non-PrintableString hyphen-type characters.
>
>I note there is a third case: values carried in a Substring Assertion.
>That aside...
>
>We certainly could have regarded only U+002D as a hyphen here.
>We didn't.  That is, I suggest we not regard this as a significant
>technical issue (one requiring we consider changes to an approved
>specification).
>
>>In Appendix B, an alternative scheme for insignificant space handling is
>>described. In conjunction with substring matching, this alternative
>>scheme tends to make substrings in the filter shorter, by removing
>>leading and trailing spaces. Therefore you get matches which you don't
>>expect, rather than not getting matches you do expect. In particular,
>>the first case erroneously states that (with this mechanism)
>>(cn=foo\20*\20bar) would NOT match CN values "foo<SPACE>bar" and
>>"foo<SPACE><SPACE>bar", but it would. The initial and final substrings
>>are reduced to "foo" and "bar", and so match the prepared values, which
>>both become "foo<SPACE>bar". The same applies to the third example,
>>which uses the same filter and one of the same value strings.
>
>Here I think that I was treating the space in (cn=foo<SPACE>*) (in
>at least some instances) as an inner space.  Aside from dropping
>the "would not" part of item 1, I need to rephrase item 3 as
>indicative of why another simple approach (just leaving these
>single spaces in) is also problematic.  Luckily this is in
>non-normative background text.
>
>Thanks, Kurt