
Re: Problems with case folding of UTF-8

To: Howard Chu <hyc@highlandsun.com>
Subject: Re: Problems with case folding of UTF-8
From: Stig Venaas <Stig@OpenLDAP.org>
Date: Mon, 10 Dec 2001 23:16:00 +0100
Cc: Pierangelo Masarati <masarati@aero.polimi.it>, Stig@OpenLDAP.org, openldap-devel@OpenLDAP.org
Content-disposition: inline
In-reply-to: <005b01c181c4$0b1e6900$0c01a8c0@fiddle.symas.com>; from hyc@highlandsun.com on Mon, Dec 10, 2001 at 01:45:54PM -0800
References: <200112101810.fBAIAak17755@server.aero.polimi.it> <005b01c181c4$0b1e6900$0c01a8c0@fiddle.symas.com>
User-agent: Mutt/1.2.5i

On Mon, Dec 10, 2001 at 01:45:54PM -0800, Howard Chu wrote:
> This makes sense to me. I wonder why we should be forced to choose a longer
> representation; as long as our conversion is self-consistent (always chooses
> the same representation) we should be free to choose the form we want.

Unicode normalization is not exactly straightforward. There are
complications, and I'm not quite sure how to do this consistently for
all characters in all the different scripts based on the Unicode tables.
Please read up on how normalization works. I don't think it would be
worth the effort: the only thing you would be solving is the need to
allocate new memory when the normalized string is longer. The only
problem I see with that is performance, so we might need better memory
handling.
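
To illustrate the point about the normalized string being longer, here
is a minimal sketch in Python using the standard unicodedata module. It
is not OpenLDAP code; the helper name normalized_growth is hypothetical,
and NFD is used only as one example of a decomposing form. It compares
the UTF-8 byte length of a string before and after normalization.

```python
import unicodedata


def utf8_len(s):
    """Return the number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def normalized_growth(s, form="NFD"):
    """Return (bytes_before, bytes_after) for normalizing s to `form`.

    Hypothetical helper for illustration only. `form` is any form
    accepted by unicodedata.normalize: NFC, NFD, NFKC or NFKD.
    """
    normalized = unicodedata.normalize(form, s)
    return utf8_len(s), utf8_len(normalized)


# Precomposed U+00E9 decomposes into "e" plus combining acute U+0301,
# so the decomposed form needs a larger buffer than the input.
before, after = normalized_growth("\u00e9", "NFD")
print(before, after)
```

A decomposed form can therefore outgrow the input buffer, which is where
the extra allocation comes from, while composing the same sequence with
NFC shrinks it again.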

Stig

Follow-Ups:
- RE: Problems with case folding of UTF-8
  - From: "Howard Chu" <hyc@highlandsun.com>

References:
- Re: Problems with case folding of UTF-8
  - From: Pierangelo Masarati <masarati@aero.polimi.it>
- RE: Problems with case folding of UTF-8
  - From: "Howard Chu" <hyc@highlandsun.com>
