
Re: =+= bug (ITS#269)



Well, turns out you have to muck with a global variable to
make GNU regex behave in a strictly POSIX manner... and I
care not to do that.  As such, I'm focusing my efforts on
testing the code as currently committed.

Kurt

At 01:23 AM 8/24/99 GMT, Kurt@OpenLDAP.org wrote:
>At 12:22 AM 8/24/99 GMT, noel@burton-krahn.com wrote:
>>Hi Guys,
>>
>>I think I know where the problem is coming from.
>>
>>When doing a pattern search, slapd escapes the search string and
>>passes it to the regex library.  The function
>>servers/slapd/filterentry.c:strcpy_special() is supposed to escape
>>chars which the regex library would otherwise construe as special,
>>such as [],*,., etc.
>
>Yes, the strcpy_special() routine was apparently not updated when
>we switched from BSD regexp to POSIX regex.  I committed a patch
>to devel (HEAD branch) which updates the code to use REG_EXTENDED
>(instead of REG_BASIC) and updated strcpy_special (renamed to
>strcpy_regex()) to escape regex operators.  We'll need to do
>some testing to ensure that the operator set is correct.  In
>particular, I am not sure if ')', '}', and ']' should be
>escaped.  Per re_format(7) (HS POSIX regex), ^.[$()|*+?{\ should
>be escaped.  (I believe this is correct per POSIX semantics).
>However, the GNU regex documentation leads me to believe that '}'
>and ']' need to be escaped when -lgnuregex (or -lc with GNU
>regex) is being used.  re_format(7) also states that an escaped
>non-operator character matches that character itself.  As such,
>escaping ')' and '}' unnecessarily should not cause problems.
>
>I won't apply the patch to OPENLDAP_REL_ENG_1_2 until I get back
>some positive feedback.  Here is the -devel patch.  It will likely
>apply without problem to 1.2 sources.
>
>http://www.openldap.org/devel/cvsweb.cgi/servers/slapd/filterentry.c.diff?r1=1.9&r2=1.10
>
>>The '0' selects POSIX Basic Regular Expression
>>syntax.  In my regex library, that means that '+' is an ordinary
>>char, while '\+' is a metacharacter for match-one-or-more-times.
>
>This, I believe, is a GNU regex extension to include all the
>functionality that's "compatible"...  However, both + and \+
>should match + when using REG_BASIC.  It appears though
>that one has to specify RE_SYNTAX_POSIX_MINIMAL_BASIC|RE_SYNTAX_NO_GNU_OPS
>and not REG_BASIC to get POSIX Basic Regular Expressions.
>
>Since I switched the code to POSIX extended regex, I believe we
>need to use REG_EXTENDED|RE_SYNTAX_NO_GNU_OPS.
>
>Kurt
>
>
>
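
For readers following the thread, here is a minimal sketch of the escaping
approach discussed above. It is not the committed strcpy_regex() code. The
names escape_regex_literal() and matches_literally() are hypothetical, the
fixed 512-byte pattern buffer is an arbitrary choice, and adding '}' and ']'
to the operator set is an assumption taken from the GNU regex concern raised
in the message. Only the standard POSIX regcomp() flags REG_EXTENDED and
REG_NOSUB are used.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Operators listed in the message per re_format(7): ^.[$()|*+?{\
 * followed by '}' and ']', which are included here only as an assumption. */
static const char REGEX_OPERATORS[] = "^.[$()|*+?{\\" "}]";

/* Hypothetical helper: copy src into dst, placing a backslash before every
 * character in REGEX_OPERATORS. dst must hold at least
 * 2 * strlen(src) + 1 bytes. Returns the number of bytes written,
 * excluding the terminating NUL. */
static size_t escape_regex_literal(char *dst, const char *src)
{
    char *out = dst;

    assert(dst != NULL && src != NULL);
    for (; *src != '\0'; src++) {
        if (strchr(REGEX_OPERATORS, *src) != NULL)
            *out++ = '\\';
        *out++ = *src;
    }
    *out = '\0';
    return (size_t)(out - dst);
}

/* Hypothetical helper: returns 1 if text contains literal verbatim,
 * 0 if it does not, and -1 if the pattern is too long or fails to compile. */
static int matches_literally(const char *literal, const char *text)
{
    char pattern[512];
    regex_t re;
    int rc;

    if (strlen(literal) * 2 + 1 > sizeof pattern)
        return -1;
    escape_regex_literal(pattern, literal);

    /* POSIX extended syntax; REG_NOSUB because only match/no-match matters. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this sketch, a search string such as "=+=" is compiled as "=\+=", so it
should match only text that contains a literal plus sign and not a run of
equals signs. How '\}' and '\]' behave is left to the regex library in use,
which is the point the thread says still needs testing.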