
Re: International characters (OpenLDAP v2.0, PHP 4)

>>>>> "Howard" == Howard Chu <hyc@highlandsun.com> writes:

    >> And the LDIF looks like: ----- s n i p ----- dn:
    >> uid=test2%bayour.com,o=Turbo Fredriksson uid: test2%bayour.com
    >> givenname: Ãrjan

    Howard> This is not a UTF-8 encoded string. It kind of looks like
    Howard> ISO8859-1. In UTF-8 the character 'Ã' would be encoded as
    Howard> two separate octets 0xd8 0x83.

Actually, this was a problem with the cut and paste... Emacs seemed to
'translate' the character...

The character I see in the browser is an uppercase A with a tilde (~) above,
followed by a dash (-) and then 'rjan'...

I'm doing the development on Linux, in a non-graphical environment, but
the web browser is IE (6.0) beside me (so I can see two things at a time).

Doing the cut-and-paste from the Windows machine and sending it to the
devel/workstation, the 'LDIF' looks like this. It still doesn't look like
what I see in the browser (I'm attaching the LDIF, maybe it comes through
clean there).

--- DEBUG: pql_user_add - user creation(normal) ---
dn: uid=test2@bayour.com%bayour.com,ou=People,o=Turbo Fredriksson
uid: test2@bayour.com%bayour.com
givenname: Ã?rjan
sn: Ã?stlund
accountstatus: active
mail: test2@bayour.com
uidnumber: 500
gidnumber: 500
gecos: Ã?rjan Ã?stlund
cn: Ã?rjan Ã?stlund
userpassword: {SHA}qUqP5cyxm6YcTAhz05Hph5gvu9M=
homedirectory: /var/mail/users/
mailhost: papadoc
deliverymode: localdelivery
mailmessagestore: /var/mail/users/
objectclass: inetorgperson
objectclass: pilotperson
objectclass: posixaccount
objectclass: qmailuser
--- DEBUG ---

Viewing the file in less will reveal two 'control characters' which
less encodes as '<C3>' and '<96>'. Using 'iconv' to convert 'Ö' from
ISO-8859-1 to UTF-8 will give the same two 'control' characters in less...
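
The byte pair reported by less can be checked independently. The following
is a minimal sketch in Python (not the PHP used in this thread; all variable
names are illustrative). It shows that 0xC3 0x96 is the UTF-8 encoding of
'Ö', and that the same two octets, misread as Windows-1252, give an A with a
tilde followed by a dash, which matches what the browser displays.

```python
# 'Ö' is U+00D6; written as an escape so the source stays pure ASCII.
o_umlaut = "\u00d6"

# In ISO-8859-1 the character is the single octet 0xD6.
latin1_bytes = o_umlaut.encode("iso-8859-1")

# Converting ISO-8859-1 to UTF-8 (roughly what iconv does) yields two octets.
utf8_bytes = latin1_bytes.decode("iso-8859-1").encode("utf-8")

print(latin1_bytes.hex())  # d6
print(utf8_bytes.hex())    # c396

# The same two octets decoded as Windows-1252 instead of UTF-8:
# 0xC3 becomes U+00C3 (A with tilde), 0x96 becomes U+2013 (a dash).
misread = utf8_bytes.decode("cp1252")
print(ascii(misread))
```

If this reading is right, the stored bytes are already valid UTF-8 and the
odd rendering is a display-side charset mismatch, not a broken value.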

Setting the locales (LC_CTYPE) to 'sv_SE.UTF-8' and re-reading the file
with less will only show ONE control character (the '<C3>' one).

    Howard> Whatever tool you used to convert text to UTF-8 is broken,
    Howard> or you're using it incorrectly.

Well, I'm using PHP 4.1.2 and its 'utf8_{encode,decode}()' function(s).
I've tried the 'accept-charset="UTF-8,ISO-8859-1"' (previous mail from
Michael Ströder) added to the form tag (with and without ISO-8859-1)
but I still get 'Invalid syntax'...
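
One thing that may be worth checking is what the value looks like at the
byte level just before it is handed to the LDAP library. PHP's utf8_encode()
treats its input as ISO-8859-1, so applying it to a value the browser already
submitted as UTF-8 encodes it a second time, while skipping it for an
ISO-8859-1 value leaves a lone 0xD6 octet, which is not valid UTF-8. The
sketch below is in Python, not PHP, and the helper names is_valid_utf8 and
to_utf8_once are hypothetical. It assumes the 'Invalid syntax' error is caused
by a value that is not valid UTF-8 on arrival, which this thread does not
confirm.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8_once(raw: bytes) -> bytes:
    """Hypothetical helper: convert to UTF-8 only when needed.

    Heuristic: bytes that are already valid UTF-8 are returned unchanged;
    anything else is assumed to be ISO-8859-1 and converted.
    """
    if is_valid_utf8(raw):
        return raw
    return raw.decode("iso-8859-1").encode("utf-8")


latin1_name = b"\xd6rjan"      # 'Örjan' in ISO-8859-1
utf8_name = b"\xc3\x96rjan"    # 'Örjan' in UTF-8

print(is_valid_utf8(latin1_name))  # False
print(is_valid_utf8(utf8_name))    # True

# Converting unconditionally (as utf8_encode() does) double-encodes
# a value that was already UTF-8.
double_encoded = utf8_name.decode("iso-8859-1").encode("utf-8")
print(double_encoded.hex())
```

The validity check is only a heuristic: a few ISO-8859-1 byte sequences also
happen to be valid UTF-8, so knowing the charset the form was actually
submitted in is more reliable than guessing from the bytes.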

Does it matter that I use OpenLDAP 2.0, not 2.1?

Attachment: test.txt.gz
Description: LDIF from win