RE: non-UTF8 string Substrings matching



Ron,

> -----Original Message-----
> From: owner-ietf-ldapbis@OpenLDAP.org
> [mailto:owner-ietf-ldapbis@OpenLDAP.org]On Behalf Of Ramsay, Ron
> Sent: Wednesday, 25 October 2000 13:33
> To: Kurt D. Zeilenga
> Cc: ietf-ldapbis@OpenLDAP.org
> Subject: RE: non-UTF8 string Substrings matching
> 
> 
> Kurt,
> 
> What do I mean by strings?
> 
> LDAP specifies a string encoding of attribute values and a string encoding
> of distinguished names. The encoding specifies UTF-8. These are the
> 'strings' of LDAP.

This doesn't fit my definition of strings. I suspect it wouldn't
be Kurt's either. The attributes with a string syntax are those
whose data type is one of the ASN.1 string types (PrintableString,
TeletexString, etc) or OCTET STRING, or a syntactic CHOICE of
string types like DirectoryString. Everything else isn't a string type.
Every data type needs encoding rules so that values of the data type
can be sent in protocol messages. X.500 exclusively uses BER for the
encoding. LDAP allows BER but also defines human readable string
encodings for a number of syntaxes. Having an LDAP string encoding
doesn't make a data type a string type, so for example, DistinguishedName
has an LDAP string encoding (RFC 2253) but it isn't a string data type.
Substring matching only applies to string types. 

> 
> Some attributes do not have a string encoding. Their values are not
> 'strings'.

They are usually not string types, but that isn't because they lack
an LDAP string encoding. TeletexString isn't a defined LDAP syntax and
it doesn't have a defined LDAP string encoding (though one can be
implied from the treatment of DirectoryString) but it is nonetheless
a string data type.

> 
> Therefore, I would not expect a substrings matching rule for certificate or
> jpegPhoto.

For certificate, definitely not since it isn't a string type
(by my definition).

For jpegPhoto, maybe, if it was defined to be an OCTET STRING,
in which case octetStringSubstringsMatch could be used.
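
To make that concrete, here is a rough sketch of what a substrings match
over raw octets might look like. It is my own illustration in Python, not
text from X.520; the function name and parameters (initial, any_parts,
final) are hypothetical and simply mirror the components of a substrings
assertion. Matching is done octet by octet, with no character set involved.

```python
from typing import Optional, Sequence


def octet_substrings_match(
    value: bytes,
    initial: Optional[bytes] = None,
    any_parts: Sequence[bytes] = (),
    final: Optional[bytes] = None,
) -> bool:
    """Hypothetical sketch: match a substrings assertion against raw octets.

    The components must appear in order and must not overlap.
    """
    pos = 0
    if initial is not None:
        if not value.startswith(initial):
            return False
        pos = len(initial)
    for part in any_parts:
        idx = value.find(part, pos)
        if idx < 0:
            return False
        pos = idx + len(part)
    if final is not None:
        # The final component must fit in what is left after earlier matches.
        if len(value) - pos < len(final) or not value.endswith(final):
            return False
    return True


# Illustrative value only: a few octets shaped like the start and end of a JPEG.
photo = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\xff\xd9"
matched = octet_substrings_match(
    photo, initial=b"\xff\xd8", any_parts=[b"JFIF"], final=b"\xff\xd9"
)
```

Note that the initial component here (FF D8) is not a valid UTF-8 sequence,
so it could not be carried in a substrings filter item that is restricted
to UTF-8.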

> 
> CommonName is a string. If its value is "Ron" then I would expect that this
> is the target for matching. Even if specified as cn;binary, I would still
> expect the value to be matched to be "Ron" and not #1203526f6e.

The abstract value being matched is "Ron" but we have a number of ways
we could encode the abstract assertion value "Ron" for the matching rule,
e.g. BER, "LDAP encoding for the syntax", DER, CER, XER, PER,
or the generic string encoding rules in my component matching
rules draft. As it happens, X.500 DAP only supports BER (DER & CER
are subsets of BER). LDAP mostly supports "LDAP encoding for the syntax"
and BER. However, in a substrings filter item LDAP currently only supports
UTF-8 strings. So if the string data type doesn't have an LDAP string
encoding, or the string encoding isn't a UTF-8 character string (the LDAP
encoding for OCTET STRING is just the raw bytes), then you can't use the
LDAP substrings filter item. Kurt wants to fix that.
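
As a small illustration of one abstract value with several encodings, the
sketch below encodes "Ron" with the LDAP string encoding (plain UTF-8) and
with BER under two different ASN.1 string types. The helper names
(ber_string, is_utf8) are mine and hypothetical; the tag values are the
universal tag numbers for the ASN.1 types, and only short-form lengths are
handled. The leading octet of the BER form depends on which string type is
chosen.

```python
# Universal tag numbers for some ASN.1 string types.
OCTET_STRING = 0x04
UTF8_STRING = 0x0C
PRINTABLE_STRING = 0x13
TELETEX_STRING = 0x14


def ber_string(tag: int, content: bytes) -> bytes:
    """Hypothetical helper: primitive BER encoding, short-form length only."""
    if len(content) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([tag, len(content)]) + content


def is_utf8(data: bytes) -> bool:
    """Return True if the octets form a valid UTF-8 character string."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same abstract value under different encodings.
ldap_encoding = "Ron".encode("utf-8")                       # 52 6f 6e
ber_utf8 = ber_string(UTF8_STRING, ldap_encoding)            # 0c 03 52 6f 6e
ber_printable = ber_string(PRINTABLE_STRING, b"Ron")         # 13 03 52 6f 6e

# An OCTET STRING value need not be UTF-8 at all.
raw_octets = b"\xff\xd8\xff\xe0"
usable_in_utf8_only_filter = is_utf8(raw_octets)
```

The last line is the case at issue: the raw octets are a perfectly good
OCTET STRING value, but they fail the UTF-8 check and so cannot appear in
a substrings filter item as currently specified.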

Regards,
Steven
  
> 
> You now say that you wish to match T.61 strings. What have these got to do
> with LDAP? LDAP values are required to be specified in UTF-8. Where are you
> trying to go with this?
> 
> Ron.
> 
> -----Original Message-----
> From: Kurt D. Zeilenga [mailto:Kurt@OpenLDAP.org]
> Sent: Wednesday, 25 October 2000 4:18
> To: Ramsay, Ron
> Cc: ietf-ldapbis@OpenLDAP.org
> Subject: RE: non-UTF8 string Substrings matching
> 
> 
> At 06:04 PM 10/24/00 +1100, Ramsay, Ron wrote:
> >I don't think xx;binary is an appropriate target for this discussion. It is
> >a way of specifying the transfer syntax. The matching rules should address
> >only the (conceptual) stored syntax.
> 
> I have three targets for this discussion:
>   X.500 octetStringSubstringsMatch rule
>   Teletex (T.61) teletexString matching via
>     default mode transfer
>   Teletex (T.61) directoryString matching via
>     ;binary mode transfer
> 
> Due to the UTF-8 restriction of the transfer syntax, these substrings
> assertions are currently not allowed.  In my opinion, the transfer syntax
> should not be restricted to UTF-8.  I believe that any assertion
> value valid per the matching rule should be transferable (using
> default or ;binary transfer).
> 
> Many implementations of LDAPv3 ignore the UTF-8 restriction upon
> the substrings transfer syntax and instead restrict the asserted
> value per the matching rule syntax.  Likewise, most LDAPv2
> implementations ignore the IA5 (ASCII) restriction upon the
> substrings transfer syntax.
> 
> This behavior, IMO, is consistent with X.500 and should be allowed.
> Note that allowing it does not require any change to the actual
> BER encoding used in the protocol.  That is, the needed change to
> the ASN.1 is notational.
> 
> >Further, I cannot see the point of extending substring-match to
> >non-strings.
> 
> By non-strings, do you mean non-UTF8 strings? or non-textual strings?
> or non- something else strings?
>