
Re: Escaping within distinguished names: RFC 2253



I asked:

> 1.  What is the precise meaning of "whitespace" as used here?

Mark writes:

>ASCII 32.
>
>   Implementations MUST allow for space (' ' ASCII 32) characters to be
>   present between name-component and ',', between attributeTypeAndValue
>   and '+', between attributeType and '=', and between '=' and
>   attributeValue.  These space characters are ignored when parsing.

Yes, and it's clear that only space characters are allowed before '='
and '+'.  However, just before the cited paragraph, the RFC says:

   Implementations MUST...
   allow whitespace characters to be present on either side of the comma
   or semicolon.  The whitespace characters are ignored....

This is what suggested to me that whitespace might have some other
definition, otherwise why word things differently and why reiterate
the rule for commas?  But granting that it's been clarified that
whitespace includes only space characters, we get to Jim's question:


>If whitespace only includes space characters, do these problems go away?

Not quite.  Section 4 requires parsers to handle RFC 1779 names, and
those names may contain carriage return <CR> characters that an
RFC 1779 parser ignores.  So we are left with no unambiguous answer to
questions such as:  What is the value of the CN attribute in the name
"CN=<CR>ab"?
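
To make the ambiguity concrete, here is a minimal sketch of the two
readings.  The helper cn_value and its "ignorable" parameter are
made up for illustration and come from neither RFC; the sketch only
handles a single "type=value" component.

```python
def cn_value(dn: str, ignorable: str) -> str:
    """Return the value of a single 'type=value' component, dropping any
    leading characters of the value that appear in `ignorable`."""
    _, _, value = dn.partition("=")
    return value.lstrip(ignorable)

name = "CN=\rab"                   # the name "CN=<CR>ab"
strict = cn_value(name, " ")       # reading 1: only ASCII 32 is ignored
lenient = cn_value(name, " \r")    # reading 2: <CR> is ignored as well
print(repr(strict), repr(lenient))
```

Under the first reading the value begins with a carriage return;
under the second it is just "ab".  A parser has to pick one.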


Jim also writes:

>I think ["whitespace"] should be defined in the BNF as it is throughout
>RFC2251 (through the use of the whsp definition).

I think Jim meant RFC 2252, in which "whsp" is defined as spaces
only.  This fits with what Mark said.

To resolve the problem of parsing RFC 1779 names, we could then require both
spaces and carriage returns to be escaped when they appear at the start or
end of an attribute value.
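
As a rough sketch of what a generator might do under that proposal:
the function below escapes only leading and trailing spaces and
carriage returns.  The name escape_edges is hypothetical, and the
choice of "\ " for a space and the hex form "\0D" for <CR> is my
assumption, not something the RFC prescribes for this case.  Other
special characters are deliberately left alone here.

```python
def escape_edges(value: str) -> str:
    """Escape spaces and carriage returns at the start or end of value."""
    pad = " \r"
    start = len(value) - len(value.lstrip(pad))   # length of leading run
    end = max(len(value.rstrip(pad)), start)      # index where trailing run begins

    def esc(run: str) -> str:
        # Assumed encodings: backslash-space for ' ', hex pair for <CR>.
        return "".join("\\ " if ch == " " else "\\0D" for ch in run)

    return esc(value[:start]) + value[start:end] + esc(value[end:])

print(escape_edges(" a b\r"))
```

Interior spaces are untouched, so only the edges of the value carry
escapes, and a parser can safely discard any unescaped space or <CR>
it finds there.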


Chris writes:

>In other words, the parsing should handle
>escaped or non-escaped "#" symbols but applications should only generate
>values that follow the grammar.

A good rule of thumb when implementing standards is to be conservative
in what you generate and liberal in what you accept.  Even so, the
RFC as it stands is self-contradictory and should be fixed.  Section 2.4
allows the name "CN=a#b" to be generated, while the grammar in section 3
does not allow it to be parsed.
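
The contradiction can be shown with a small sketch.  Both functions
are my own simplified readings of the RFC text, not reference code:
escape_2_4 covers only the single-character escaping rules of
section 2.4, and parses_3 covers only the "*( stringchar / pair )"
alternative of the section 3 "string" production (the "#" hexstring
and quoted forms are omitted).

```python
import re

def escape_2_4(value: str) -> str:
    """Escape as section 2.4 is read here: '#' or space at the start,
    space at the end, and comma, plus, quote, backslash, <, >, ; anywhere."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if (ch in ',+"\\<>;'
                or (i == 0 and ch in " #")
                or (i == last and ch == " ")):
            out.append("\\")
        out.append(ch)
    return "".join(out)

# stringchar: anything except a special, backslash or quotation mark.
# pair: backslash followed by a special, backslash, quote, or hex pair.
STRING = re.compile(r'(?:[^,=+<>#;\\"]|\\(?:[,=+<>#;\\"]|[0-9A-Fa-f]{2}))*')

def parses_3(s: str) -> bool:
    """True if s matches the simplified section 3 string production."""
    return STRING.fullmatch(s) is not None

generated = escape_2_4("a#b")
print(generated, parses_3(generated))
```

The generator leaves the interior "#" unescaped, and the grammar
check then rejects the result because "#" is a special.  Escaping it
as "a\#b" would parse, which is why a liberal parser alone does not
remove the inconsistency in the text.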


Scott Seligman
Java Software Engineering
Sun Microsystems