
LDAPDN problems, and changes since RFC 2253



I've been working with parsing/unparsing of old and new LDAP DNs lately;
the experience was rather unsettling.

Some of this may have been discussed before; if so, feel free to declare
the matter closed.

There are a lot of minor changes to the DN string representation since
RFC 2253, but only a few are listed in draft-ietf-ldapbis-dn-14.txt
Appendix B.  The rest (listed below) should be added to Appendix B.  I
also suggest that the appendix be split into 3 sections: DN->string,
string->DN and other changes.

Section 2.4 (Converting an AttributeValue from ASN.1 to a String):

* Mandate the #<hex...> form when attribute type is numericoid;
  this was merely a "SHOULD" in RFC 2253.
* Forbid null characters in the result; one must use '\00'.
* Allow '\=' in the result.

Section 3 (Parsing a String back to a Distinguished Name):

* In the attributeType:
  - Accept 1-letter attribute types and reject some invalid
    numericoids, due to the productions in [Models].

* In the attributeValue input string:
  - Reject null characters.
  - Accept '\ ' and unescaped '='.
  - Accept '#' except as the first character.

These changes are already noted:

>    - Updated Section 2.4 to allow hex pair escaping of all characters
>      and clarified escaping for when multiple octet UTF-8 echodings are
>      present.

OK.

>    - Revised specification (in Section 2) to allow short names of any
>      registered attribute type to appear in string representations of
>      DNs instead of being restricted to a "published table".  Remove
>      "as an example" language.  Added statement (in Section 3) allowing
>      recognition of additional names but require recognization of those
>      names in the published table.  The table is now published in
>      Section 3.

The result is that implementations need not recognize all string
representations that they produce: They are to produce short names if
the attribute type is in the registry (section 2.3), but need only
recognize the short names in the table in section 3.

>    - Replaced specification of additional requirements for LDAPv2
>      implementations which also support LDAPv3 (RFC 2253, Section 4)
>      with a statement (in Section 3) allowing recognition of
>      alternative string representations.

That one is more far-reaching than it looks.  It causes two changes to
attribute value parsing:

- If one did not implement LDAPv2 compatibility, unescaped SPACE at
  the beginning and end was valid and not ignored when the string
  was parsed.  Now it is an error.

- If one did implement LDAPv2 compatibility, trailing whitespace was
  ignored before the comma.  Now it is significant.  SPACE at the end
  is an error, but other whitespace is valid.

  However, the RFC 2253 DN->string algorithm did _not_ say that trailing
  whitespace other than SPACE must be escaped.

  If LDAPv2 compatibility was implemented, SPACE was also ignored around
  '+' and '='.  I expect some implementations treat other whitespace the
  same way, in which case the comment above applies to that too.  Other
  implementations may treat other whitespace as normal characters even
  before comma.

  Also, since Section 4 (LDAPv2 compat.) says SPACE shall be ignored
  around [,;+=], it would not work to escape a trailing SPACE as '\ '.
  'cn=foo\ ,o=bar' would become ('cn=foo\', 'o=bar') before one got
  around to parsing the RDNs.  It seems safe to assume there are
  implementations around that do that.

  I suspect SPACE, or at least the trailing SPACE, was intended to be
  escaped as '\20', not as '\ ': Note the wording of RFC 2253 2.4:

      o   a space or "#" character [...at beginning...]
      o   a space character [...at end...]
      o   one of the characters ",", "+", """, "\", "<", ">" or ";"
      [...]
      If a character to be escaped is one of the list shown above, then
      it is prefixed by a backslash ('\' ASCII 92).

  Maybe 'the list above' only meant the line with ","...";", not space
  and '#'.  If so, production of '\ ' and '\#' are two other changes
  since RFC 2253.  (It's hard to tell: The RFC grammar accepts '\#' but
  rejects '\ '.  For that matter, the RFC is buggy about '=' vs. '\=':
  It only produces '=' but only accepts '\='.)

Anyway, this means the new draft describes neither the "plain LDAPv3"
nor the "LDAPv3 + LDAPv2" semantics of RFC 2253 in this respect.
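
The trailing-SPACE problem with 'cn=foo\ ,o=bar' can be reproduced with
a small sketch.  The splitter below is hypothetical and deliberately
naive: it is not taken from any real implementation, and its name and
behaviour are assumptions made for illustration.  It splits on
unescaped ',' or ';' and then strips SPACE around each piece, which is
one plausible reading of RFC 2253 Section 4.

```python
import re

def split_rdns_v2_compat(dn: str) -> list[str]:
    """Hypothetical LDAPv2-compatible splitter (illustration only).

    Splits on ',' or ';' not directly preceded by a backslash, then
    strips SPACE around each piece before the RDNs are parsed.  The
    lookbehind is itself naive: it would mishandle an escaped
    backslash followed by a separator.
    """
    parts = re.split(r'(?<!\\)[,;]', dn)
    return [p.strip(' ') for p in parts]

# A trailing SPACE escaped as '\ ' loses the SPACE and keeps a dangling '\'.
print(split_rdns_v2_compat(r'cn=foo\ ,o=bar'))
# The same value escaped as '\20' survives the stripping step.
print(split_rdns_v2_compat(r'cn=foo\20,o=bar'))
```

With the '\20' form the first piece is still a well-formed RDN string,
which is why the hex form looks like the safer thing to produce.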

I suggest that:

- ASCII whitespace other than SPACE as the first or last character of
  the AttributeValue must also be escaped.

- A trailing ASCII whitespace must, or at least SHOULD, be escaped
  with the \<hex> form, not as '\ '.  Or the '\ ' form could be
  removed altogether from Section 2.4 (but not from Section 3).

- Section 3 (string->DN) adds a note about leading and trailing
  whitespace other than SPACE, but I don't know what.  Maybe it should
  be legal to reject them, to strip them (plus any SPACE hidden
  by them), or to accept them as part of the AttributeValue.

- Section 2.4 adds a note that RFC 2253 was buggy and implementations
  therefore were somewhat incompatible, but if one wishes the best
  possible RFC 2253 compatibility, one should not produce '=', '\=',
  '#', or '\ ', nor ASCII whitespace as the first or last character.
  Instead use the \hex form (and maybe '\#' for '#').
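
As a rough sketch of the last suggestion, a producer aiming for the
best possible RFC 2253 compatibility might escape values as follows.
The function name and character sets are assumptions for illustration.
The sketch handles only the string form (not the #<hex...> form), and
it always writes '#' as '\23' rather than '\#'.

```python
SPECIAL = set(',+"\\<>;')          # escaped as backslash + character
ALWAYS_HEX = set('=#\x00')         # never emitted as '=', '\=', '#' or NUL
ASCII_WS = set(' \t\n\r\x0b\x0c')  # hex-escaped only when first or last

def escape_value_conservative(value: str) -> str:
    """Hypothetical conservative escaper for an attribute value string."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in ALWAYS_HEX or (ch in ASCII_WS and i in (0, last)):
            out.append('\\%02X' % ord(ch))   # \<hex> form, e.g. '\20'
        elif ch in SPECIAL:
            out.append('\\' + ch)            # e.g. '\,'
        else:
            out.append(ch)                   # everything else unchanged
    return ''.join(out)

print(escape_value_conservative(' a=b '))
```

Whitespace in the middle of a value is left alone; only the first and
last characters get the hex treatment.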

About the '\ ' form, I note that, according to the C standard, there are
file systems that strip trailing spaces from lines in text files.  But maybe
DNs are not intended to be stored in text files anyway, since they can
contain control characters like CR and LF.  Come to think of it, even
the LDIF format allows plain ^Z characters, which I don't think DOS
file systems are going to like.


A few other notes:

In Section 3:

>    string     = [ (leadchar / pair)
>                 [ *( stringchar / pair ) ( trailchar / pair ) ] ]

I suggest indenting the 2nd line of the string production by 2 spaces,
and maybe removing the space after ( and before ) to shorten the line.

In Appendix B:

>    - Updated Section 2.3 to indicate attribute type name strings are
>      case insensitive.

Where?  I can't see it.

-- 
Hallvard