
Fw: Reserved characters for a LDAP URI



Dear LDAP and PKIX wizards,

The following transcript is from the Asia PKI Interoperability WG.
The Taiwan side and the Japan side disagree about whether
a comma character ',' inside an RDN of the DN component of
an LDAP URI should be escaped with the '%' method. Currently,
several Asian countries are conducting PKI interoperability tests,
and in our certificate profiles we put an LDAP URI in the CRL
Distribution Points extension of a certificate to point to the LDAP
entry where the CRL is published.

For example, for the following DN:
  country name: TW
  organization name: Chunghwa Telecom Co., Ltd.
  organizational unit name: PKI IWG

The Taiwan side believes that the DN component in an LDAP URI should
be:

ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C,%20Ltd.,c=TW

However, the Japan side insists that it should be:

ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C%2C%20Ltd.,c=TW

(The difference is between "%5C,%20" and "%5C%2C%20".)
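
As a rough illustration of the two positions, the following Python sketch
derives both candidate encodings from the RFC 2253 string form of the DN.
It assumes Python's standard urllib.parse.quote as the escaping routine;
the variable names are made up for this example and the sketch is not
taken from either side's implementation.

```python
from urllib.parse import quote

# RFC 2253 string form of the example DN; the comma inside the
# organization name is already escaped with a backslash.
dn = r"ou=PKI IWG,o=Chunghwa Telecom Co.\, Ltd.,c=TW"

# Taiwan side: keep ',' and '=' literal and escape only the unsafe
# characters (here the space and the backslash).
tw_form = quote(dn, safe=",=")

# Japan side: additionally percent-escape the comma that follows the
# backslash, while keeping the RDN-separating commas literal.
jp_form = tw_form.replace("%5C,", "%5C%2C")

print(tw_form)
print(jp_form)
```

The two results differ only in how the comma after "%5C" is written.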

After long discussions, we still cannot reach a consensus. So
I am forwarding the following transcript to the LDAPbis and PKIX mailing
lists. Please comment to help us resolve the disagreement.

Thanks in advance!

Wen-Cheng Wang (wcwang@cht.com.tw)
Telecommunication Labs,
Chunghwa Telecom Co., Ltd.

----- Original Message -----
From:  "SATOSHI TAKEMOTO"
To: <wcwang@cht.com.tw>
Sent: Saturday, December 21, 2002 5:36 PM
Subject: [iwg_tech:290] Re: Reserved characters for a LDAP URI


> Dear Dr. Wang,
>
> I appreciate your hard work.
> I think that our opinions differ on what a "component" is.
>
> Mr. Wang said the following:
> > Thus, an LDAP URI consists of a scheme component (that is, "ldap"), a
> > <hostport> component, a <dn> component, a <scope> component, a
> > <filter> component, and an <extensions> component. Of course, some
> > of these components are optional.
>
> I think, since a "DN" is a set of "RDN"s, we should consider that not
> the "DN" but each "RDN" is a "component". If so, the comma
> character should be escaped. However, this opinion may be wrong if the
> components of a "URI" and the components of a "DN" are considered
> independently.
>
> If further research shows that the
> comma character must be escaped, I will report back.
>
> Best Regards & I wish you a merry Christmas.
> --
> Satoshi TAKEMOTO
>
>  >>>>> On Thu, 19 Dec 2002 22:12:47 +0800,
> "Wen-Cheng Wang" <wcwang@cht.com.tw> wrote:
>
> > Dear Satoshi,
> >
> > After a very careful study of RFC 1738, RFC 2396, RFC 2253,
> > RFC 2255 and draft-ietf-ldapbis-url-02.txt (the revision of
> > RFC 2255 from LDAPbis, called RFC 2255bis below), I conclude that the
> > comma character ',' is not a reserved character for the DN component of
> > an LDAP URI. I know that this conclusion might surprise many people.
> > However, let me explain why it is not.
> >
> > Whether a character is reserved is a matter
> > of context. A character (for example, the comma character ',')
> > may be reserved in one URI scheme but not
> > in another. Even within the same URI
> > scheme, a character may be reserved in one component of
> > a URI but not in another component
> > of that URI. When a character appears in a context where it is
> > not "reserved", it does not need to be escaped.
> >
> > The following text is quoted from RFC 2396.
> >
> > ----- Begin of text quoted from RFC 2396 -----
> >
> > 2.2. Reserved Characters
> >
> >    The "reserved" syntax class above refers to those characters that are
> >    allowed within a URI, but which may not be allowed within a
> >    particular component of the generic URI syntax; they are used as
> >    delimiters of the components described in Section 3.
> >
> >    Characters in the "reserved" set are not reserved in all contexts.
> >    The set of characters actually reserved within any given URI
> >    component is defined by that component. In general, a character is
> >    reserved if the semantics of the URI changes if the character is
> >    replaced with its escaped US-ASCII encoding.
> >
> > 2.4.2. When to Escape and Unescape
> >
> >    A URI is always in an "escaped" form, since escaping or unescaping a
> >    completed URI might change its semantics.  Normally, the only time
> >    escape encodings can safely be made is when the URI is being created
> >    from its component parts; each component may have its own set of
> >    characters that are reserved, so only the mechanism responsible for
> >    generating or interpreting that component can determine whether or
> >    not escaping a character will change its semantics. Likewise, a URI
> >    must be separated into its components before the escaped characters
> >    within those components can be safely decoded.
> >
> > ----- End of text quoted from RFC 2396 -----
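
The last sentence of the quoted Section 2.4.2 (separate first, decode
afterwards) can be seen in a small Python sketch. The DN value below is a
hypothetical one chosen only because it contains an escaped '?'; it does
not come from the interoperability test.

```python
from urllib.parse import unquote

# Everything after "ldap://host/": a dn containing an escaped '?',
# followed by the '?' delimiter and an attribute name.
url_tail = "o=Example%3F,c=TW?cn"

# Split on the delimiter first, then unescape each component.
split_then_decode = [unquote(part) for part in url_tail.split("?")]

# Unescaping first turns the escaped '?' into a bogus delimiter.
decode_then_split = unquote(url_tail).split("?")

print(split_then_decode)
print(decode_then_split)
```

The first order yields two components; the second yields three because
the '?' that belonged to the data is mistaken for a delimiter.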
> >
> > So, what are the "components" of a URI? The following text is quoted from
> > RFC 2396 Section 3:
> >
> > ----- Begin of text quoted from RFC 2396 -----
> >
> >    This "generic URI" syntax consists of a sequence of four main
> >    components:
> >
> >       <scheme>://<authority><path>?<query>
> >
> >    each of which, except <scheme>, may be absent from a particular URI.
> >    For example, some URI schemes do not allow an <authority> component,
> >    and others do not use a <query> component.
> >
> > ----- End of text quoted from RFC 2396 -----
> >
> > So, what are "reserved" characters? According to RFC 2396 Section 2.2,
> > reserved characters are those "used as delimiters of the components
> > described in Section 3". They are reserved because they are used as
> > delimiters of the components, so they cannot appear in a component
> > without escaping, or the parser of the URI may be confused.
> >
> > So, what are the components of an LDAP URI? Let's take a look at the
> > format of an LDAP URI. According to RFC 2255 and RFC 2255bis, the format
> > of an LDAP URI is as follows:
> >
> >     ldapurl    = scheme "://" [hostport] ["/" dn ["?" [attributes]
> >                  ["?" [scope] ["?" [filter] ["?" extensions]]]]]
> >
> > That is:
> >
> >     ldap://<hostport>/<dn>?<attributes>?<scope>?<filter>?<extensions>
> >
> > Thus, an LDAP URI consists of a scheme component (that is, "ldap"), a
> > <hostport> component, a <dn> component, a <scope> component, a
> > <filter> component, and an <extensions> component. Of course, some
> > of these components are optional.
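
The component view described above can be sketched in Python as follows.
The function name split_ldap_url and the host ldap.example.com are
placeholders invented for this illustration, and the sketch ignores
validation that a real LDAP client would perform.

```python
from urllib.parse import unquote

def split_ldap_url(url):
    """Split an LDAP URL on its delimiters, then unescape the components."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("not a URL with a '://' separator")
    hostport, _, tail = rest.partition("/")
    # '?' delimits the components, so split on it before unescaping.
    fields = tail.split("?", 4)
    fields += [""] * (5 - len(fields))
    dn, attributes, scope, filt, extensions = fields
    return {
        "scheme": scheme,
        "hostport": hostport,
        "dn": unquote(dn),
        "attributes": unquote(attributes),
        "scope": unquote(scope),
        "filter": unquote(filt),
        # Left escaped: extensions are themselves comma-delimited, so they
        # would need a further split on ',' before unescaping.
        "extensions": extensions,
    }

# ldap.example.com is a placeholder host, not one from the thread.
parts = split_ldap_url(
    "ldap://ldap.example.com/"
    "ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C,%20Ltd.,c=TW"
    "?certificateRevocationList?base"
)
print(parts["dn"])
print(parts["attributes"], parts["scope"])
```

Because the commas of the DN never act as URL-level delimiters in this
split, they pass through to the dn component unchanged whether or not
they were percent-escaped.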
> >
> > So, what are the "reserved" characters for an LDAP URI? Since the question
> > mark '?' is the only character used as a delimiter of the components of an
> > LDAP URI, I conclude that it is the only reserved character for an LDAP
> > URI, with the exception that a comma character ',' is also a reserved
> > character if it appears inside an extension value in the <extensions>
> > component.
> >
> > Let's take a look at what RFC 2255 and RFC 2255bis say about reserved
> > characters for an LDAP URI:
> >
> > ----- Begin of text quoted from RFC 2255 Section 3 -----
> >
> >    Note that any URL-illegal characters (e.g., spaces), URL special
> >    characters (as defined in section 2.2 of RFC 1738) and the reserved
> >    character '?' (ASCII 63) occurring inside a dn, filter, or other
> >    element of an LDAP URL MUST be escaped using the % method described
> >    in RFC 1738 [5]. If a comma character ',' occurs inside an extension
> >    value, the character MUST also be escaped using the % method.
> >
> > ----- End of text quoted from RFC 2255 Section 3 -----
> >
> > ----- Begin of text quoted from RFC 2255bis Section 3 -----
> >
> >    Note that characters that are not safe (e.g., spaces) (as defined in
> >    section 2.1 of [RFC2396]), and the single Reserved character '?'
> >    occurring inside a dn, filter, or other element of an LDAP URL MUST
> >    be escaped using the % method described in section 2.4 of [RFC2396].
> >    If a comma character ',' occurs inside an extension value, the
> >    character MUST also be escaped using the % method.
> >
> > ----- End of text quoted from RFC 2255bis Section 3 -----
> >
> > Please note that both documents say that the question mark '?' is the
> > single reserved character for a dn, filter, or other component of
> > an LDAP URI, except that a comma character ',' occurring inside an
> > extension value is also reserved. (I believe that this is because the
> > comma character ',' is already used as a delimiter between extensions
> > inside the <extensions> component, and there is no escape mechanism
> > there like the backslash '\' prefix defined in RFC 2253 for
> > distinguishing a delimiter comma character from a comma character that
> > appears in an RDN (Relative DN).)
> >
> > So, I conclude that the comma character ',' is not a reserved
> > character inside the DN component of an LDAP URI.
> >
> > Let's take a full DN into consideration.
> >
> > Suppose a full DN is as follows:
> >
> > country name: TW
> > organization name: Chunghwa Telecom Co., Ltd.
> > organizational unit name: PKI IWG
> >
> > After applying RFC 2253 encoding, its LDAP DN string will be:
> >   ou=PKI IWG,o=Chunghwa Telecom Co.\, Ltd.,c=TW
> >
> > (I believe that we all agree with this.)
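
The RFC 2253 step above can be sketched in Python. The helper
escape_rfc2253_value is a hypothetical name, and the sketch is a
simplified reading of RFC 2253 Section 2.4 that covers only the special
characters and the leading/trailing cases; it does not handle hex-encoded
or multi-valued RDNs.

```python
def escape_rfc2253_value(value):
    """Backslash-escape one attribute value (simplified RFC 2253 rules)."""
    special = ',+"\\<>;'
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in special:
            out.append("\\" + ch)
        elif ch == "#" and i == 0:
            out.append("\\#")
        elif ch == " " and (i == 0 or i == last):
            out.append("\\ ")
        else:
            out.append(ch)
    return "".join(out)

# RDNs listed most-specific first, as in the string form shown above.
rdns = [("ou", "PKI IWG"), ("o", "Chunghwa Telecom Co., Ltd."), ("c", "TW")]
dn = ",".join(f"{attr}={escape_rfc2253_value(val)}" for attr, val in rdns)
print(dn)
```

Only the comma inside the organization name gains a backslash; the commas
that join the RDNs are added afterwards as separators.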
> >
> > Since the comma character ',' is not a reserved character inside the
> > DN component, there is no need to escape it. However, the space
> > character and the backslash character '\' are "unsafe" characters
> > listed in RFC 1738 Section 2.2 and are "Excluded US-ASCII Characters"
> > listed in RFC 2396 Section 2.4.3, so they must always be escaped as %20
> > and %5C respectively.
> >
> > So, I believe that the following encoding of the DN component in a
> > LDAP URI for the previous LDAP DN string is correct:
> >
> >   ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C,%20Ltd.,c=TW
> >
> > However, since RFC 1738 Section 2.2 says that:
> >
> >    On the other hand, characters that are not required to be encoded
> >    (including alphanumerics) may be encoded within the scheme-specific
> >    part of a URL, as long as they are not being used for a reserved
> >    purpose.
> >
> > I think that the following encodings of the DN component in an
> > LDAP URI are all correct as well:
> >
> >      ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C%2C%20Ltd.,c=TW
> >      ou=PKI%20IWG%2Co=Chunghwa%20Telecom%20Co.%5C%2C%20Ltd.%2Cc=TW
> >      ou%3DPKI%20IWG%2Co%3DChunghwa%20Telecom%20Co.%5C%2C%20Ltd.%2Cc%3DTW
> >
> > However, please note that RFC 2396 now discourages these kinds of
> > encoding. In RFC 2396 Section 2.3, it says that:
> >
> >    Unreserved characters can be escaped without changing the semantics
> >    of the URI, but this should not be done unless the URI is being used
> >    in a context that does not allow the unescaped character to appear.
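
The claim that these encodings are interchangeable can be checked with a
short Python sketch: if the DN component is unescaped as a whole, as the
component view argued above implies, all four candidate strings yield the
same RFC 2253 DN. The list name candidates is made up for this example.

```python
from urllib.parse import unquote

# The four candidate encodings of the DN component discussed above.
candidates = [
    "ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C,%20Ltd.,c=TW",
    "ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C%2C%20Ltd.,c=TW",
    "ou=PKI%20IWG%2Co=Chunghwa%20Telecom%20Co.%5C%2C%20Ltd.%2Cc=TW",
    "ou%3DPKI%20IWG%2Co%3DChunghwa%20Telecom%20Co.%5C%2C%20Ltd.%2Cc%3DTW",
]

# Unescape the whole component; identical results collapse in the set.
decoded = {unquote(c) for c in candidates}
print(decoded)
```

A receiver that instead split the DN on ',' before unescaping would treat
the third and fourth forms differently, which is the interoperability
risk raised in this thread.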
> >
> > Best Regards,
> > Wen-Cheng Wang
> >
> > ----- Original Message -----
> > From: "SATOSHI TAKEMOTO"
> > To: <wcwang@cht.com.tw>
> > Sent: Thursday, December 19, 2002 2:01 PM
> > Subject: [iwg_tech:268] Fw: this is a secure e-mail from JP
> >
> >
> > > Dear Dr. Wang,
> > >
> > >  As for the conversion of "\," in the LDAP URI, I think that the
> > > following procedure is required.
> > >    Chunghwa Telecom Co., Ltd. -> (RFC2253) -> Chunghwa Telecom Co.\, Ltd. ->
> > >  (RFC2396) -> Chunghwa%20Telecom%20Co.%5C%2C%20Ltd.
> > >
> > >  That is, according to RFC2253, "," is set to "\,",
> > > and according to RFC2396, "\," is set to "%5C%2C".
> > > Since RFC2253 and RFC2396 are the independent specifications,
> > > we should think independently.
> > > And when we convert a DN to a URI,
> > > I think that it is necessary to convert "\" and "," independently.
> > > So, I think that we need to escape it twice.
> > >
> > > Best Regards,
> > > --
> > > Satoshi TAKEMOTO
> > >
> > >
> > > > ----- Original Message -----
> > > > From: "Wen-Cheng Wang" <wcwang@cht.com.tw>
> > > > To: "SATOSHI TAKEMOTO"
> > > > Sent: Wednesday, December 18, 2002 8:23 PM
> > > > Subject: Re: this is a secure e-mail from JP
> > > >
> > > >
> > > > > Dear Satoshi,
> > > > >
> > > > > As for the conversion of "\," in the LDAP URI, my opinion is as
> > > > > follows:
> > > > >
> > > > > I believe that any "," character within the DN component in an LDAP
> > > > > URI does not need to be escaped with the "%" notation, even if it is
> > > > > not used as a delimiter.
> > > > > For reserved characters, RFC 2396 (or RFC 1738) means that a
> > > > > reserved character does not need to be escaped within the URI
> > > > > component in which the character has a reserved purpose. The ","
> > > > > character is a reserved character in the DN component of an LDAP URI,
> > > > > which means it does not need to be escaped. Although the "," in "\,"
> > > > > is not used as a delimiter, it is already escaped by the backslash
> > > > > ("\") prefix as RFC 2253 requires, so I believe that we need not
> > > > > escape it twice.
> > > > >
> > > > > On the other hand, since the "," in "\," is not used as a delimiter,
> > > > > I think it is OK to escape it with the "%" notation. That is,
> > > > > semantically and syntactically both "%5C," and "%5C%2C" are fine.
> > > > > However, we still need to test whether all LDAP servers and clients
> > > > > are compatible with the latter.
> > > > >
> > > > > Maybe we should forward our discussion to the LDAP mailing list?
> > > > > Maybe they have already discussed this issue before?
> > > > >
> > > > > Best Regards,
> > > > > Wen-Cheng Wang
> > > > >
> > > > > ----- Original Message -----
> > > > > From: "SATOSHI TAKEMOTO"
> > > > > To: "Wen-Cheng Wang" <wcwang@cht.com.tw>
> > > > > Sent: Tuesday, December 17, 2002 10:00 PM
> > > > > Subject: this is a secure e-mail from JP
> > > > >
> > > > >
> > > > > > Dear Mr. Wang
> > > > > >
> > > > > >   This is Satoshi TAKEMOTO from JP side.
> > > > > >
> > > > > > To my understanding, the URI of the cRLDP will be as
> > > > > > follows.
> > > > > >
> > > > > > ldap://pkiiwg.chttl.com.tw/cn=CRLforPhase2,ou=Test%20Root%20CA,ou=PKI%20IWG,o=Chunghwa%20Telecom%20Co.%5C%2C%20Ltd.,c=TW?certificateRevocationList
> > > > > >
> > > > > > That is, I think that the conversion of "\," in the LDAP URI is not
> > > > > > "%5C," but "%5C%2C". According to RFC2396, "," MUST be escaped, I
> > > > > > think.
> > > > > >
> > > > > > ----- <cited from rfc2396> -----
> > > > > > 2.2. Reserved Characters
> > > > > >
> > > > > >    Many URI include components consisting of or delimited by, certain
> > > > > >    special characters.  These characters are called "reserved", since
> > > > > >    their usage within the URI component is limited to their reserved
> > > > > >    purpose.  If the data for a URI component would conflict with the
> > > > > >    reserved purpose, then the conflicting data must be escaped before
> > > > > >    forming the URI.
> > > > > >
> > > > > >       reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
> > > > > >                     "$" | ","
> > > > > > ----- </cited from rfc2396> -----
> > > > > >
> > > > > > However, RFC1738 is cited in RFC2255, and according to RFC1738, ","
> > > > > > does not need to be escaped.  (Although this may be an unnecessary
> > > > > > addition, RFC1738 has now been updated by RFC2396.)
> > > > > >
> > > > > > Although I think that "%5C%2C" is right from a technical viewpoint,
> > > > > > I think it is a very difficult problem how this should be described
> > > > > > when backward compatibility is considered.
> > > > > >
> > > > > > thanks.
> > > > > > --
> > > > > > Satoshi TAKEMOTO