
RE: non-UTF8 string Substrings matching



Kurt,

I don't think xx;binary is an appropriate target for this discussion. It is
a way of specifying the transfer syntax. The matching rules should address
only the (conceptual) stored syntax.

Further, I cannot see the point of extending substring-match to non-strings.

Therefore, I cannot see why the syntax should be changed.

Ron.

-----Original Message-----
From: Kurt D. Zeilenga [mailto:Kurt@OpenLDAP.org]
Sent: Tuesday, 24 October 2000 17:03
To: ietf-ldapbis@OpenLDAP.org
Subject: non-UTF8 string Substrings matching


RFC 2251 defines a substrings filter as:

 SubstringFilter ::= SEQUENCE {
   type            AttributeDescription,
   -- at least one must be present
   substrings      SEQUENCE OF CHOICE {
     initial [0] LDAPString,
     any     [1] LDAPString,
     final   [2] LDAPString } }
 LDAPString ::= OCTET STRING

where LDAPString is restricted to the UTF-8 encoding of the
ISO 10646-1 character set.

This implies that octetSubstringsMatch cannot be specified
as the SUBSTR matching rule of any attribute type as the
asserted substrings are not restricted to UTF-8.

This also implies that (cn;binary=*hvalue*) [where hvalue
is the hex-escaped BER encoded value] is invalid as the BER
encoding itself is not restricted to UTF-8.
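
A minimal sketch of the restriction described above, assuming that
"restricted to UTF-8" simply means the octets must decode as well-formed
UTF-8. The helper name is_valid_ldapstring and the sample values are
hypothetical and not taken from RFC 2251:

```python
def is_valid_ldapstring(octets: bytes) -> bool:
    """Return True if the octets are well-formed UTF-8 (hypothetical helper)."""
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A textual substring assertion: plain UTF-8, so it fits LDAPString.
text_substring = "Zeilenga".encode("utf-8")

# The first octets of a BER SEQUENCE with a long-form length (illustrative
# only). 0x82 is not a valid UTF-8 start byte, so this cannot be an LDAPString.
octet_substring = bytes([0x30, 0x82, 0x01, 0x0A])

print(is_valid_ldapstring(text_substring))
print(is_valid_ldapstring(octet_substring))
```

The second value is the kind of substring that the current definition
rules out, whatever matching rule the attribute type names.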

To allow non-UTF8 string substring assertions, it
might be appropriate to change the ASN.1 to:

 SubstringFilter ::= SEQUENCE {
   type            AttributeDescription,
   -- at least one must be present
   substrings      SEQUENCE OF CHOICE {
     initial [0] LDAPSubstring,
     any     [1] LDAPSubstring,
     final   [2] LDAPSubstring } }
 LDAPSubstring ::= OCTET STRING

where the actual value held in LDAPSubstring is restricted
to the syntax appropriate for the substrings assertion.

For (cn=*value*), the LDAPSubstring is restricted to
UTF-8.  For (cn;binary=*hvalue*), the LDAPSubstring
must contain the BER-encoded directoryString asserted
value.  For (1.2.3=*value*), where the SUBSTR matching
rule of 1.2.3 is octetSubstringMatch, the LDAPSubstring may be
any octet string.  Likewise for other substrings
assertion syntaxes.
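
As a rough illustration of the string form of such assertions, the sketch
below builds an "any" substring filter from arbitrary octets, using the
backslash plus two hex digits escape from the RFC 2254 string representation
of filters. The function names hex_escape and substring_any_filter are
hypothetical, and escaping every octet (rather than only those that require
it) is an assumption made for simplicity:

```python
def hex_escape(octets: bytes) -> str:
    """Escape every octet as a backslash followed by two hex digits."""
    return "".join(f"\\{b:02x}" for b in octets)


def substring_any_filter(attr: str, octets: bytes) -> str:
    """Build (attr=*hvalue*) where hvalue is the hex-escaped octet string."""
    return f"({attr}=*{hex_escape(octets)}*)"


# Octets that are not valid UTF-8, asserted against the example type 1.2.3.
print(substring_any_filter("1.2.3", b"\xff\xfe"))
```

On the wire the LDAPSubstring would then carry the raw octets 0xFF 0xFE,
which the current LDAPString restriction does not permit.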

Basically the proposal is to trade one notational
convenience for another such that we can describe the
full range of behavior allowed by X.500.  That is,
it makes possible the encoding of non-UTF8 substrings
assertions.

Comments?

Kurt