
RE: non-UTF8 string Substrings matching



Kurt,

> -----Original Message-----
> From: owner-ietf-ldapbis@OpenLDAP.org
> [mailto:owner-ietf-ldapbis@OpenLDAP.org]On Behalf Of Kurt D. Zeilenga
> Sent: Tuesday, 24 October 2000 17:03
> To: ietf-ldapbis@OpenLDAP.org
> Subject: non-UTF8 string Substrings matching
> 
> 
> RFC 2251 defines a substrings filter as:
> 
>  SubstringFilter ::= SEQUENCE {
>    type            AttributeDescription,
>    -- at least one must be present
>    substrings      SEQUENCE OF CHOICE {
>      initial [0] LDAPString,
>      any     [1] LDAPString,
>      final   [2] LDAPString } }
>  LDAPString ::= OCTET STRING
> 
> where LDAPString is restricted to the UTF-8 encoding of the
> ISO 10646-1 character set.
> 
> This implies that octetSubstringsMatch cannot be specified
> as the SUBSTR matching rule of any attribute type as the
> asserted substrings are not restricted to UTF-8.
> 
> This also implies that (cn;binary=*hvalue*) [where hvalue
> is the hex-escaped BER encoded value] is invalid as the BER
> encoding itself is not restricted to UTF-8.
> 
> To allow non-UTF8 string substring assertions, it
> might be appropriate to change the ASN.1 to:
> 
>  SubstringFilter ::= SEQUENCE {
>    type            AttributeDescription,
>    -- at least one must be present
>    substrings      SEQUENCE OF CHOICE {
>      initial [0] LDAPSubstring,
>      any     [1] LDAPSubstring,
>      final   [2] LDAPSubstring } }
>  LDAPSubstring ::= OCTET STRING
> 
> where the actual value held in LDAPSubstring is restricted
> to the syntax appropriate for the substrings assertion.

In X.511, the substrings component of a FilterItem uses
(a definition equivalent to) AttributeValue for the initial, any
and final components. I suggest using AttributeValue instead of
defining LDAPSubstring. For one thing this makes the alignment
with X.500 more precise, and for another thing it relieves you
of any need to exhaustively describe what encoding goes in the
LDAPSubstring.
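
As a rough sketch of what this suggestion might look like (my
illustration only, not agreed text), the filter would reuse the
existing AttributeValue type in place of a new LDAPSubstring:

```asn1
-- Sketch only: SubstringFilter with AttributeValue substituted
-- for the proposed LDAPSubstring in each component.
SubstringFilter ::= SEQUENCE {
  type            AttributeDescription,
  -- at least one must be present
  substrings      SEQUENCE OF CHOICE {
    initial [0] AttributeValue,
    any     [1] AttributeValue,
    final   [2] AttributeValue } }

-- Already defined in RFC 2251; repeated here for reference.
AttributeValue ::= OCTET STRING
```

Since RFC 2251 already defines AttributeValue as an OCTET STRING,
no new type definition would be needed, and the encoding on the
wire would be the same as with LDAPSubstring.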

With or without this extra change I agree with your proposal.

[Aside: It would actually make more sense for X.500 and LDAP to
both use AssertionValue, but that is a different story.]

Regards,
Steven

> 
> For (cn=*value*), the LDAPSubstring is restricted to
> UTF-8.  For (cn;binary=*hvalue*), the LDAPSubstring
> must contain the BER-encoded directoryString asserted
> value.  For (1.2.3=*value*), where 1.2.3 SUBSTR matching
> rule is octetSubstringMatch, the LDAPSubstring may be
> any octet string.  Likewise for other substrings
> assertion syntaxes.
> 
> Basically the proposal is to trade one notational
> convenience for another such that we can describe the
> full range of behavior allowed by X.500.  That is,
> it makes possible the encoding of non-UTF8 substrings
> assertions.
> 
> Comments?
> 
> Kurt
> 
>