Re: ACL regex and Unicode (was: acl again)



At 08:29 AM 2001-11-12, Kurt D. Zeilenga wrote:
>At 05:26 AM 2001-11-10, Pierangelo Masarati wrote:
>>Stig wrote: 
>>> Maybe we should try to normalize that in the code when parsing the rule?
>>> I should look into Unicode and ACLs, we need to decide what to try to
>>> normalize though. Normalization and regexps are a bit awkward, even more
>>> so when taking Unicode into account.

In handling Unicode in DN regexes, there are two simple approaches.

a) "normalized" DN strings contain escaped UTF-8, regex strings
   must then contain escaped UTF-8.

b) "normalized" DN strings contain unescaped UTF-8, regex strings
   must then contain unescaped UTF-8.

There are more complex approaches, which require UTF-8 (escaped or
not) in the regex to be munged into the form produced by DN
normalization (unescaped or not).  But this is VERY messy and likely
very problematic.
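
To illustrate why the munging route gets messy, here is a minimal
Python sketch, not existing slapd code.  The helpers escape_non_ascii
and munge_pattern are hypothetical names, and the \XX hex-pair
escaping is an assumption modelled on the RFC 2253 string form.

```python
import re

def escape_non_ascii(text: str) -> str:
    # Hypothetical normalizer: each non-ASCII UTF-8 byte becomes a \XX hex pair.
    return "".join(chr(b) if b < 0x80 else "\\%02X" % b
                   for b in text.encode("utf-8"))

def munge_pattern(pattern: str) -> str:
    # Naive munging (an assumption, for illustration only): rewrite each
    # non-ASCII byte in the regex as an escaped backslash plus hex pair,
    # so that it matches the \XX text literally.
    return "".join(chr(b) if b < 0x80 else "\\\\%02X" % b
                   for b in pattern.encode("utf-8"))

dn_escaped = escape_non_ascii("cn=Jos\u00e9,dc=example,dc=com")

# A literal non-ASCII character in the pattern survives the munging.
literal_hit = re.search(munge_pattern("^cn=Jos\u00e9,"), dn_escaped)

# A character class does not: the munged class now lists the single
# characters '\', 'C', '3', 'A', '9', '8' instead of two letters.
class_hit = re.search(munge_pattern("^cn=Jos[\u00e9\u00e8],"), dn_escaped)

print(dn_escaped)
print(literal_hit is not None, class_hit is not None)
```

Getting the character class right would mean parsing the regex and
rewriting it into an alternation of escaped sequences, which is the
kind of complexity meant above.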

The pro of a) is that we can use normal regex functions; the con is
that matching would be ASCII based, not Unicode based.  That may be
livable.  The pro of b) is that one gets Unicode-aware regexes; the
con is that this requires Unicode-aware regex functions (and Unicode
configuration files).
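
As a rough illustration of the trade-off, the Python sketch below
runs the same pattern against an escaped and an unescaped form of one
DN.  The example DN and the \XX escaped form are assumptions, and
Python's re module merely stands in for whatever regex library would
actually be used.

```python
import re

raw = "cn=Jos\u00e9,dc=example,dc=com"          # unescaped form
escaped = "cn=Jos\\C3\\A9,dc=example,dc=com"    # assumed \XX escaped form

pattern = r"^cn=Jos.,"   # "Jos", any one character, then a comma

# Escaped form, plain ASCII matching: "." consumes only the backslash,
# so the pattern fails; the rule author must spell out the escapes.
ascii_on_escaped = re.search(pattern, escaped)
explicit_on_escaped = re.search(r"^cn=Jos\\C3\\A9,", escaped)

# Unescaped form, byte-oriented matching: "." consumes one byte of the
# two-byte UTF-8 sequence, so the pattern fails here as well.
byte_on_raw = re.search(pattern.encode("ascii"), raw.encode("utf-8"))

# Unescaped form, Unicode-aware matching: "." consumes the whole character.
unicode_on_raw = re.search(pattern, raw)

for name, hit in [("ascii_on_escaped", ascii_on_escaped),
                  ("explicit_on_escaped", explicit_on_escaped),
                  ("byte_on_raw", byte_on_raw),
                  ("unicode_on_raw", unicode_on_raw)]:
    print(name, hit is not None)
```

Only the Unicode-aware match treats the accented letter as a single
character; the escaped form works with ordinary regex functions but
leaves it to the rule author to write out each escape sequence.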

Now, one thing to note is that the ACL normalization of the DN
could differ from the ordinary DN normalization used elsewhere.
This would clearly be needed if we go the a) route as I believe we
should not escape Unicode characters in DNs transferred over the
protocol.  That is, protocol DN escaping should be minimal.

Comments?

Kurt