
Re: [ldapext] UTF-8 full support in LDIF / LDIF v2




OK, let's split the two issues:

First, is it worth amending the standard to allow non-base64 UTF-8?
(Forget about multi-line attribute values for a moment.)

Looking back (way back) at this thread, I am not the only one seeing value here:


Michael Ströder wrote:
I'm not convinced that removing the ASCII restrictions will be a good
thing. Not only do I doubt it will have a net positive on displayability of LDIF for those who have a displayability goal (I don't have this goal), I think it will have a net negative impact on interoperability and user confusion, such as when the user creates a file using one Unicode normalization algorithm, but is trying to set values which require a different Unicode normalization value.
How so? In the current version, you have to encode your Unicode to
UTF-8, and then encode it again to base64. With my proposal, you would
get the exact same UTF-8 strings as you do today, but they would not be
(or would not have to be) encoded in base64.

I agree with Yves here.


Steven Legg wrote:
LDIF is first and foremost an interchange format.  Conversion from LDAP
PDU->LDIF Record->LDAP PDU MUST produce as output the input, octet for
octet for every "data" component (the DN, every attribute description
and associated values, etc.).

That's highly desirable for directory to directory interchange, but LDIF is also used for composing data from various data sources to put in a directory and to extract data from a directory to put in other data sources. The octet-for-octet preservation usually doesn't apply in these other cases and the need to turn line-based data such as XML documents into base64 encodings is a serious impediment, hence the reason Andrew and I wrote the Internet-draft.


Ludovic Poitou wrote:
>> I went back through the mailing list archive, "charset" came up, but I
>> can't make sense of who started with it.

> I probably did.
> In Europe there are lots of directory users and administrators that have
> non-ASCII data they need to transform to LDIF.
> With simple scripts, turning the data to UTF-8 and then to base64 encoded
> is a pain. Allowing to specify a charset and then letting the tools do
> the conversion to UTF-8 automatically could simplify their life.




Can we try to think of what sort of problems non-encoded UTF-8 would create?
If there are none, then could we implement a version 2 that does this?
The people who don't care for it can carry on using version 1.
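
To make the comparison concrete, here is a rough Python sketch of the two behaviours. The function names are mine and purely illustrative. ldif_v1_line() follows my reading of the SAFE-STRING rules in RFC 2849; ldif_raw_utf8_line() is only a guess at what a "version 2" writer might do, not anything that has been specified. Both start from the same UTF-8 octets; the only difference is whether non-ASCII octets force the base64 step.

```python
import base64

# Leading octets that force base64 in LDIF version 1: space, colon, '<'.
_UNSAFE_INIT = (b" ", b":", b"<")


def _structurally_unsafe(raw: bytes) -> bool:
    """True if the value would break line-based parsing regardless of charset."""
    return (
        any(b in (0x00, 0x0A, 0x0D) for b in raw)  # NUL, LF, CR
        or raw[:1] in _UNSAFE_INIT
        or raw.endswith(b" ")
    )


def ldif_v1_line(attr: str, value: str) -> str:
    """Version 1 behaviour: any non-ASCII octet forces base64."""
    raw = value.encode("utf-8")
    if _structurally_unsafe(raw) or any(b > 0x7F for b in raw):
        return f"{attr}:: {base64.b64encode(raw).decode('ascii')}"
    return f"{attr}: {value}"


def ldif_raw_utf8_line(attr: str, value: str) -> str:
    """Hypothetical 'version 2' behaviour (an assumption, not a spec):
    UTF-8 is written as-is; base64 is kept only for structural problems."""
    raw = value.encode("utf-8")
    if _structurally_unsafe(raw):
        return f"{attr}:: {base64.b64encode(raw).decode('ascii')}"
    return f"{attr}: {value}"


if __name__ == "__main__":
    for name in ("Yves", "Michael Ströder"):
        print(ldif_v1_line("cn", name))
        print(ldif_raw_utf8_line("cn", name))
```

Decoding the base64 form of the second name gives back exactly the octets that the raw form writes directly, which is the point I was making above. Note that this sketch does nothing about Unicode normalization; it writes whatever octets it is given.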

--
Yves.
http://www.sollers.ca/

_______________________________________________
Ldapext mailing list
Ldapext@ietf.org
https://www.ietf.org/mailman/listinfo/ldapext