
Re: Problems with case folding of UTF-8



At 10:35 AM 2001-12-22, Michael Ströder wrote:
>Stig Venaas wrote:
>> 
>> adding new entry "cn=Stig Venås, dc=my-domain,dc=com"
>
>Well, you have to tell us that this string is improperly interpreted
>as ISO-8859-1 by your xterm. Otherwise it's meaningless. ;-)
>
>> The DN in base64 is Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20
>
>Are you sure about that being properly base64-encoded?

echo -n 'Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20' \
  | b64d | hexdump -C
00000000  63 6e 3d 53 74 69 67 20  56 65 6e c3 a5 73 2c 20  |cn=Stig Ven..s, |
00000010  64 63 3d 6d 79 2d 64 6f  6d 61 69 6e 2c 64 63 3d  |dc=my-domain,dc=|
00000020  63 6f 6d                                          |com|
00000023

(b64d is an alias which uses perl to decode the base64)
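
The string as posted is 47 characters long, one short of a multiple of four, so it lacks its trailing "=" pad. A lenient decoder (as the perl one above evidently is) accepts that; a strict one may not. A minimal sketch in current Python 3, where decode_lenient is a hypothetical helper name of my own and not part of any library:

```python
import base64
import binascii

dn_b64 = "Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20"

def decode_lenient(s):
    """Hypothetical helper: restore missing '=' padding, then decode."""
    padded = s + "=" * (-len(s) % 4)  # pad up to a multiple of 4 characters
    return base64.b64decode(padded)

# The strict standard-library decoder rejects the unpadded input.
try:
    base64.b64decode(dn_b64)
    strict_ok = True
except binascii.Error:
    strict_ok = False

raw = decode_lenient(dn_b64)   # the bytes shown in the hexdump above
dn = raw.decode("utf-8")       # the DN as a text string
```

With the pad restored, the decoded bytes match the hexdump, including the c3 a5 pair for the a-ring.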


>Python 2.1.1 (#5, Nov 18 2001, 17:07:23) 
>[GCC 2.95.2 19991024 (release)] on linux2
>>>> import base64
>>>> base64.decodestring('Y249U3RpZyBWZW7DpXMsIGRjPW15LWRvbWFpbixkYz1jb20')
>Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
>  File "/usr/lib/python2.1/base64.py", line 47, in decodestring
>    decode(f, g)
>  File "/usr/lib/python2.1/base64.py", line 31, in decode
>    s = binascii.a2b_base64(line)
>binascii.Error: Incorrect padding
>>>>
>
>> Ã¥ is å (a with circle above), and should still be one character
>> when normalized (still 2 characters in UTF-8).
>
>For your records Python's UTF-8 encoding:
>
>>>> unicode('Venås','iso-8859-1').encode('utf-8')
>'Ven\xc3\xa5s'
>>>> 

As Stig provided.
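
For anyone wanting to reproduce the encoding point in a current Python 3, here is a small sketch. It assumes NFC is the normalization form meant above, and the variable names are mine. It shows that the a-ring is one character but two octets in UTF-8, and that reading those octets as ISO-8859-1 gives the two-character garble seen in the xterm output:

```python
import unicodedata

name = "Ven\u00e5s"                       # "Venås", with a-ring as U+00E5
utf8 = name.encode("utf-8")               # a-ring becomes the octets c3 a5
mojibake = utf8.decode("iso-8859-1")      # what a Latin-1 terminal displays

# A decomposed spelling (a + combining ring above) composes back to U+00E5.
decomposed = "Vena\u030as"
composed = unicodedata.normalize("NFC", decomposed)

print(len(name), len(utf8))               # characters vs. octets
print(repr(utf8))
print(mojibake)
```

The octet string is the same 'Ven\xc3\xa5s' that the Python 2 session above reports.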