Re: Problems with case folding of UTF-8

To: michael@stroeder.com
Subject: Re: Problems with case folding of UTF-8
From: Pierangelo Masarati <masarati@aero.polimi.it>
Date: Sat, 22 Dec 2001 19:43:25 +0100 (MET)
Cc: openldap-devel@OpenLDAP.org
In-reply-to: <3C24D333.1AC14254@stroeder.com> from Michael Ströder at Dec 22, 2001 07:38:43 pm

> Pierangelo Masarati wrote:
> > 
> > > Pierangelo Masarati wrote:
> > > >
> > > > Can you, Stig and Michael, provide a set of strings that do not
> > > > work, so that I can try to see what's going on?
> > >
> > > Well, Ströder (hopefully properly encoded as ISO-8859-1 in this
> > > e-mail) is one. The hex-escaped string representation produced by
> > > Python's UTF-8 Unicode codec is:
> > >
> > > 'Str\xc3\xb6der'
> > >
> > > Furthermore, here are all the German umlauts and ß (each two bytes
> > > long in UTF-8):
> > >
> > > 'äöüÄÖÜß' ->
> > >
> > > '\xc3\xa4\xc3\xb6\xc3\xbc\xc3\x84\xc3\x96\xc3\x9c\xc3\x9f'
> > 
> > I guess you also need to omit the 'x', right? '\c3\a4' and so on ...
> 
> Depends on what your interpreter/compiler is:
> Read Python's string representation '\xc3' as being a string of
> length one with the single byte of value [use your favourite
> hex-representation of decimal 195 here].

I'm talking about RFC 2253, where an escaped byte is written as
'\'<hex><hex>, with '\'<special> being the "usual" escaped chars
(',' '+' ';' ...)
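
To make the two notations concrete, here is a rough sketch in Python 3
(not the Python of this thread, and not OpenLDAP code). The helper name
rfc2253_hex_escape is made up for illustration. It assumes the value is
first encoded as UTF-8, writes each non-ASCII byte as '\'<hex><hex>, and
backslash-escapes the special characters listed in RFC 2253 section 2.4.
It deliberately ignores the rules for a leading '#' and for leading or
trailing spaces.

```python
# Hypothetical helper, for illustration only; not an OpenLDAP or python-ldap API.
SPECIALS = ',+"\\<>;'  # characters RFC 2253 section 2.4 escapes as '\' + char


def rfc2253_hex_escape(value: str) -> str:
    """Return value with non-ASCII UTF-8 bytes written as '\\' + two hex digits."""
    out = []
    for byte in value.encode("utf-8"):
        char = chr(byte)
        if byte >= 0x80:
            # One byte of a multi-byte UTF-8 sequence: '\' <hex> <hex>
            out.append("\\%02x" % byte)
        elif char in SPECIALS:
            # The "usual" escaped characters: '\' <special>
            out.append("\\" + char)
        else:
            out.append(char)
    return "".join(out)


raw = "Ströder".encode("utf-8")
print(raw)                         # b'Str\xc3\xb6der'  (Python's own repr)
print(len(b"\xc3"), b"\xc3"[0])    # 1 195  (one byte, decimal value 195)
print(rfc2253_hex_escape("Ströder"))  # Str\c3\b6der
```

The bytes are the same in both printouts; only the escape syntax differs,
'\xc3' in Python's representation versus '\c3' in the RFC 2253 string form.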

Pierangelo.
