Re: ACL regex and Unicode (was: acl again)
On Mon, Nov 12, 2001 at 08:29:53AM -0800, Kurt D. Zeilenga wrote:
> >I read in some posting from Kurt that there are plans to implement
> >Unicode regex; I guess you're working on that. I agree we'd try to
> >normalize ACLs as well.
> I note that one has to be extremely careful mucking with regular
> expressions. What we need to do is to better normalize the DN being
> matched against such that we might be better able to normalize
> the asserted regex. For example, by requiring specials in values
> to be hex escaped instead of simple backslash escaped means that
> the only comma in the stored value is the RDN separator and that
> MAY mean one can safely remove all spaces after commas in the asserted
> regex. We need to be conservative here, that is, we should only
> do those regex rewrites which we can prove are safe.
I agree, it's really a rat hole. Say a regexp looks for a DN with a
space after the comma: should one then assume that the user only wants
DNs that still have that whitespace after we normalize them, or also
DNs that were merely entered with a space after the ","?
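
To make the question concrete, here is a minimal sketch in Python (not
slapd code). The helper strip_spaces_after_commas and the sample DN are
made up for illustration, and the sketch assumes specials in values are
already hex escaped, so that every literal comma is an RDN separator.

```python
import re

def strip_spaces_after_commas(text: str) -> str:
    """Drop spaces that follow a comma (hypothetical helper).

    Only safe for a DN if commas inside values are hex escaped ("\\2C"),
    so that every literal comma is an RDN separator.
    """
    return re.sub(r",[ ]+", ",", text)

# A DN as a user might enter it; the comma in the value is hex escaped.
entered = r"cn=Smith\2C John, ou=People, dc=example, dc=com"
stored = strip_spaces_after_commas(entered)

# A regex written against the "entered" form, with spaces after commas.
user_regex = r"cn=.*, ou=People, dc=example, dc=com"
rewritten = strip_spaces_after_commas(user_regex)

# The original regex no longer matches the normalized DN ...
print(re.fullmatch(user_regex, stored) is not None)
# ... but the rewritten regex does.
print(re.fullmatch(rewritten, stored) is not None)

# The same textual rewrite is NOT safe for every regex: here it turns
# the class "comma or space" into "comma only" and changes the meaning.
print(strip_spaces_after_commas(r"cn=a[, ]b"))
```

The last line is the kind of case that argues for doing only those
rewrites we can prove are safe.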
I'm also wondering a bit about Unicode and regexps. If a user looks
for, say, A.* we should probably not find strings that start with A"
(umlaut A). And what counts as, for instance, white space or a numeric
character in Unicode, and what about intervals? I'm not convinced
regexps are well defined for Unicode, but there is regexp code for the
library we use; I'll look at that.
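
As a rough illustration of these questions, here is how one existing
engine behaves. Python's re module is used purely as a stand-in; it is
not the library referred to above, and other engines may well answer
differently. The helper starts_with_a and the sample strings are made
up for the example.

```python
import re
import unicodedata

def starts_with_a(s: str) -> bool:
    """True if the regex A.* matches at the start of s."""
    return re.match(r"A.*", s) is not None

precomposed = "\u00c4rger"                               # umlaut A as one code point
decomposed = unicodedata.normalize("NFD", precomposed)  # "A" + U+0308 + "rger"

# Same text to the user, different answers depending on normalization.
print(starts_with_a(precomposed))   # False
print(starts_with_a(decomposed))    # True

# "White space" and "numeric" depend on the mode the engine runs in.
nbsp = "\u00a0"                     # NO-BREAK SPACE
arabic_three = "\u0663"             # ARABIC-INDIC DIGIT THREE
print(re.fullmatch(r"\s", nbsp) is not None)            # True
print(re.fullmatch(r"\s", nbsp, re.ASCII) is not None)  # False
print(re.fullmatch(r"\d", arabic_three) is not None)    # True

# Intervals are plain code point ranges here, so [A-Z] excludes umlaut A.
print(re.fullmatch(r"[A-Z]", "\u00c4") is not None)     # False
```

So whether A.* finds umlaut A depends on which normalization form the
stored string is in, which is another reason to settle normalization
first.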
These are just the things I can think of right now. I think this is
really tricky stuff that we need to ponder.