
Re: "space" handling in "case ignore match" string matching rule?



Date forwarded: 	Sun, 14 Mar 1999 08:43:03 -0800 (PST)
Date sent:      	Sun, 14 Mar 1999 08:42:50 -0800
From:           	Jeff Hodges <Jeff.Hodges@Stanford.edu>
To:             	ietf-ldapext@netscape.com
Copies to:      	Jeff Hodges <Jeff.Hodges@Stanford.edu>
Subject:        	"space" handling in "case ignore match" string matching rule?
Forwarded by:   	ietf-ldapext@netscape.com

Jeff
I am circulating this to the X.500 group as well as the LDAP group, 
as I suspect a defect in X.500 that has been carried over into LDAP.

Specific comments are interspersed with your text below.
David

> This question has to do specifically with the handling of spaces in "case
> ignore match" string matching rule as applied in the context of a
> substring filter (i.e. a filter assertion string containing wildcards and
> space character(s))...
> 
> 
> In RFC2251, filter processing is discussed as a blurb towards the end of
> section "4.5.1. Search Request". At the end of that blurb, there's this
> reference..
> 
>      "More details of filter processing are given in section 7.8 of X.511
>       [8]."
> 
> 
> In section "7.8.2	Filter item", ITU-T Rec. X.511 (1993 E) sez..
> 
> A FilterItem may be undefined (as described above). Otherwise, where the
> FilterItem asserts:
>  a)	equality - It is TRUE if and only if there is a value of the
>  attribute or one of its subtypes for which the equality matching rule
>  applied to that value and the presented value returns TRUE.
>  b)	substrings - It is TRUE if and only if there is a value of the
>  attribute or one of its subtypes for which the substring matching rule
>  applied to that value and the presented value in strings returns TRUE.
>  See ITU-T Rec. X.520 | ISO/IEC 9594-6 for a description of the
>  semantics of the presented value.
> 
> 
> ITU-T Rec. X.520 (1993 E)  sez...
> 
> 6.1	String matching rules
> In the matching rules specified in 7.1.1 through 7.1.11, the following
> spaces are regarded as not significant:
>  -	leading spaces (i.e. those preceding the first printing character);
>  -	trailing spaces (i.e. those following the last printing character);
>  -	multiple consecutive internal spaces (these are taken as equivalent
>  	to a single space character).
> In the matching rules to which these apply, the strings to be matched
> shall be matched as if the insignificant spaces were not present in
> either string.
> 
> 
> (besides the "7.1.1 through 7.1.11" actually meaning "6.1.1 through
> 6.1.11", I suspect)
> 

Correct. This is also a defect, but a typographical one, not a logical 
or interpretative one.
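
As a rough illustration of the quoted 6.1 rule when it is applied to a 
whole string, here is a minimal Python sketch. The function name 
normalize_case_ignore is hypothetical, and lower() is only an assumed 
stand-in for whatever case folding a real server performs.

```python
import re


def normalize_case_ignore(value: str) -> str:
    """Drop insignificant spaces as in the quoted rule, then fold case.

    Assumption: only the ASCII space counts as a "space" here, and
    str.lower() approximates case-insensitive comparison.
    """
    # Leading and trailing spaces are not significant.
    trimmed = value.strip(" ")
    # A run of internal spaces is equivalent to a single space.
    collapsed = re.sub(r" +", " ", trimmed)
    return collapsed.lower()


def case_ignore_equal(stored: str, presented: str) -> bool:
    """Equality match on the normalized forms of both strings."""
    return normalize_case_ignore(stored) == normalize_case_ignore(presented)
```

For whole-string equality this is uncontroversial. The open question in 
this thread is what "leading" and "trailing" mean once the presented 
value is split into substring components.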

> 
> ..so, the above taken together implies to me that if I have a set of
> entries like so..
> 
> 
>   cn=afs a test
>   cn=afs b test
>   cn=afs c test
>   cn=afs d test
>   cn=afs e test
>   cn=afs f test
>   cn=afs g test
>   cn=afs h test
>:
> 
> Then a filter of "(cn=* f* test)" ought to find ~only~ "cn=afs f test",
> unless the space in the leading "* " portion of the filter is evaluated as
> being equivalent to a "leading space" and the filter is conflated to
> "(cn=*f* test)" prior to actual application. 

I suspect that on reading the standard this interpretation could be 
arrived at. This is because in X.500 the filter is actually expressed in 
ASN.1 as

	substrings	[1]	SEQUENCE {
		type	ATTRIBUTE.&id({SupportedAttributes}),
		strings	SEQUENCE OF CHOICE {
			initial	[0]  ATTRIBUTE.&Type({SupportedAttributes}{@substrings.type}),
			any	[1]  ATTRIBUTE.&Type({SupportedAttributes}{@substrings.type}),
			final	[2]  ATTRIBUTE.&Type({SupportedAttributes}{@substrings.type})}}



i.e. put simply as 
substrings, SEQ{ any " f", final "test" }

in which case the " f" could legitimately be truncated to "f" by 
chopping off its leading space, following the text of the standard 
that you quoted above. I would argue, however, that this is a 
misinterpretation of "leading space". Leading space should refer to 
the leading space in the attribute value and in the initial substring 
of the filter, but not in the any or final substrings. Therefore I 
suggest that, as a minimum, the text of the standard needs 
clarification, as it does not say how leading space applies to a 
substrings filter. For that matter, there is probably a similar 
defect for trailing space: an initial substring of "f " might be 
truncated to "f" by having its trailing space removed. This should 
not happen, but the trailing space should be removed if the filter is 
final "f ".
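
The two readings can be compared with a small Python sketch run against 
the test entries from the original question. Everything here is 
illustrative: parse_filter, substring_match and the position_aware 
flag are hypothetical names, the pattern is assumed to contain at least 
one "*", and lower() stands in for case folding. Note that splitting 
the string form "* f* test" keeps the space in front of "test" in the 
final component.

```python
import re

# The test entries from the original question.
ENTRIES = [f"afs {c} test" for c in "abcdefgh"]


def squeeze(text):
    """Collapse runs of spaces to one space and fold case."""
    return re.sub(r" +", " ", text).lower()


def parse_filter(pattern):
    """Split a substring assertion on '*' into (initial, any list, final)."""
    parts = pattern.split("*")
    return parts[0], [p for p in parts[1:-1] if p], parts[-1]


def substring_match(value, pattern, position_aware):
    initial, any_parts, final = parse_filter(pattern)
    value = squeeze(value).strip(" ")
    if position_aware:
        # Reading argued for above: only the outer edges are insignificant.
        initial = squeeze(initial).lstrip(" ")
        any_parts = [squeeze(p) for p in any_parts]
        final = squeeze(final).rstrip(" ")
    else:
        # Literal reading: every component loses leading/trailing spaces.
        initial = squeeze(initial).strip(" ")
        any_parts = [squeeze(p).strip(" ") for p in any_parts]
        final = squeeze(final).strip(" ")
    if not value.startswith(initial):
        return False
    pos = len(initial)
    for part in any_parts:
        found = value.find(part, pos)
        if found < 0:
            return False
        pos = found + len(part)
    # The final component must fit after everything matched so far.
    return len(value) - pos >= len(final) and value.endswith(final)


literal = [e for e in ENTRIES if substring_match(e, "* f* test", False)]
aware = [e for e in ENTRIES if substring_match(e, "* f* test", True)]
```

Under these assumptions the literal reading returns every entry, 
because each value contains the "f" of "afs", while the position-aware 
reading returns only "afs f test". The same function also covers the 
trailing-space case: with position_aware set, an initial "f " no longer 
matches a value such as "foo bar".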

David



> 
> If the latter occurs, then the results of such a search against the test
> entries above simply returns all the test entries (which isn't exactly
> what I/we were expecting). 
> 
> Is this behavior "correct" according to others' interpretation of RFC2251
> (+ X.511 & X.520, as referenced by RFC2251)?  I.e. I'm wondering if the
> particular directory server implementation we're using is handling this
> filter assertion string correctly or not. 
> 
> thanks,
> 
> Jeff
> 
> 
> --
> Jeff Hodges                                   Jeff.Hodges@Stanford.edu
> Senior Technical Staff                          voice: +1 650 723 2452
> Directory Services and PKI                        fax: +1 650 723 0908
> Stanford University                   http://www.stanford.edu/~hodges/
> 
> 


***************************************************

David Chadwick
IT Institute, University of Salford, Salford M5 4WT
Tel +44 161 295 5351  Fax +44 161 745 8169
Mobile +44 370 957 287
Email D.W.Chadwick@iti.salford.ac.uk
Home Page  http://www.salford.ac.uk/its024/chadwick.htm
Understanding X.500  http://www.salford.ac.uk/its024/X500.htm
X.500/LDAP Seminars http://www.salford.ac.uk/its024/seminars.htm
Entrust key validation string MLJ9-DU5T-HV8J

***************************************************