
Re: international chars in attributes



At 08:43 AM 5/21/01, Goetz Golla wrote:

>It has been noted several times on this list that in LDAP v3 attribute
>values with special characters are stored UTF-8 and base64 encoded.

Please note that LDAPv3 requires use of UTF-8 on the wire.
LDIFv1 uses base64 to encode data that is not printable.
LDIFv1 is a file format, not a wire representation.

How data is stored (by either peer) is completely up to
that peer.


>I have two questions about this:
>
>(i) Whenever I have an ldif file with umlauts in it, I get the message:
>ldap_add: Invalid syntax
>        additional info: value contains invalid data

This implies that the value doesn't conform to the syntax
restrictions of the attribute.  Attributes of directoryString
syntax are restricted to UTF-8.

>Is it possible to make openldap do the conversion from some international
>charset to UTF-8/base64 itself, or do I have to do this manually before
>submitting an ldif file ?

You need to do the conversion yourself before submitting the
LDIF file.  OpenLDAP assumes data provided to it conforms to
LDIFv1 and LDAPv3.
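
As a rough illustration of doing that conversion yourself, here is a
minimal Python sketch.  It assumes the source data is ISO-8859-1 (adjust
source_charset to whatever your data really uses); the function names
ldif_line and is_safe are made up for this example and are not part of
OpenLDAP.  The "safe string" test is a simplified reading of the LDIFv1
rules (RFC 2849), and the sketch does not fold long lines.

```python
import base64


def is_safe(data: bytes) -> bool:
    """Simplified LDIFv1 safe-string check on the UTF-8 bytes of a value."""
    if not data:
        return True
    # A value must not start with SPACE, ':' or '<' if written in plain form.
    if data[0] in b" :<":
        return False
    # A trailing space is easily lost, so base64 is used in that case too.
    if data[-1:] == b" ":
        return False
    # Only 7-bit bytes, and no NUL, LF or CR, may appear in plain form.
    return all(b < 128 and b not in (0, 10, 13) for b in data)


def ldif_line(attr: str, raw: bytes, source_charset: str = "iso-8859-1") -> str:
    """Build one LDIF line from a value given in an assumed legacy charset."""
    utf8 = raw.decode(source_charset).encode("utf-8")
    if is_safe(utf8):
        return f"{attr}: {utf8.decode('ascii')}"
    # Anything else is written as base64 of the UTF-8 bytes, marked by '::'.
    return f"{attr}:: {base64.b64encode(utf8).decode('ascii')}"
```

With this, ldif_line("cn", b"G\xf6tz") yields a "cn:: ..." line carrying
the base64 of the UTF-8 form of the name, while a plain ASCII value is
written unchanged after "cn: ".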

>(ii) When I have an entry which is base64 encoded, I understand that the
>client has to do the decoding itself.
>When talking to an X.500 Server with an LDAP Port I can recognize the
>encoded entries by two ':' after the attribute name, e.g.:
>
>cn:: LKHSfoiawdkasnkasjhf==
>
>  ^^
>
>However, OpenLDAP does not seem to behave like that. Then, how can the
>client know if it should decode the value, or not ?

A client interacting directly with LDAPv3 doesn't need to deal with
LDIFv1 or its base64 representation.  An application reading an LDIFv1
file needs to handle LDIFv1 base64 encoded values.  The application
knows that a value is base64 encoded because the attribute name is
followed by '::' instead of ':'.

Note that if a client adds:

cn: LKHSfoiawdkasnkasjhf==

Then the value itself contains those characters, as opposed to
being whatever that base64 decodes to.
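
To make the ':' versus '::' distinction concrete, here is a small Python
sketch of how an application reading an LDIF file might treat a single
attribute line.  The function name parse_ldif_value is hypothetical, and
the sketch deliberately ignores folded (continued) lines and refuses the
':<' URL form.

```python
import base64
from typing import Tuple


def parse_ldif_value(line: str) -> Tuple[str, bytes]:
    """Return (attribute, value bytes) for one unfolded LDIF attribute line."""
    attr, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("not an attribute line: %r" % line)
    if rest.startswith(":"):
        # 'attr:: ...' means the value is base64 and must be decoded.
        return attr, base64.b64decode(rest[1:].strip())
    if rest.startswith("<"):
        # 'attr:< url' refers to external data; not handled in this sketch.
        raise ValueError("URL-valued attributes are not handled here")
    # 'attr: ...' means the characters themselves are the value.
    return attr, rest.lstrip(" ").encode("utf-8")
```

Given the same text after the attribute name, the '::' form returns the
decoded bytes, while the ':' form returns the literal characters,
including any trailing '=' signs.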