Re: RE : RE : Add flag to UTF8normalize and pals to allow accent stripping

On Tue, Feb 26, 2002 at 03:26:57PM +0100, John Hughes wrote:
> Stig writes:
> > I've found a better solution to this problem (IMO). Rather than
> > changing the code, it can be done by only changing the Unicode
> > tables.
> That's rather wholesale, isn't it?  I'd even go so far as to say
> it's *wrong*.  I agreed that it's a bad idea to change the (externally
> supplied) ucdata library, but changing the (standard) Unicode data
> files seems even worse.

Well, departing from the standard Unicode normalization and matching is
something I think users must choose individually, depending on what
language they think the data are in, or what matching they or their
users/applications prefer. If you want to change the standard behavior,
that must be configured somehow. One way would be to change the tables.
The data I suggest changing is currently used only by slapd for
matching, although I see the problem that it might be used for other
things in the future. I believe this would give you the effect you want,
though. I'm making approximate matching ignore accents now, which should
help some. But if you want exact match to ignore accents, then you want
the accents to go away when normalization is done, and the Unicode data
describes how this should be done.
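
To illustrate the effect being discussed, here is a rough sketch in
Python. It is not the slapd or ucdata code, and it strips accents in
code rather than by editing the tables; the function names
strip_accents and accent_insensitive_equal are made up for the example.
It decomposes each character using the standard Unicode data and then
drops the combining marks, so that an exact comparison no longer sees
the accents.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose text (NFKD) and drop combining marks such as accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero combining class for
    # combining marks (e.g. U+0301 COMBINING ACUTE ACCENT) and 0 otherwise.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_insensitive_equal(a: str, b: str) -> bool:
    """Exact match performed on the accent-stripped forms of both values."""
    return strip_accents(a) == strip_accents(b)


if __name__ == "__main__":
    print(strip_accents("Müller"))
    print(accent_insensitive_equal("résumé", "resume"))
```

Note that this only removes accents that the Unicode data represents as
a base character plus a combining mark. Letters with no such
decomposition (for example "ø") pass through unchanged.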