
Re: [ldapext] UTF-8 full support in LDIF / LDIF v2

To: Kurt Zeilenga <Kurt.Zeilenga@isode.com>
Subject: Re: [ldapext] UTF-8 full support in LDIF / LDIF v2
From: Steven Legg <steven.legg@eb2bcom.com>
Date: Wed, 24 Jun 2009 17:54:16 +1000
Cc: ldapext@ietf.org


Kurt,

Kurt Zeilenga wrote:

On Jun 18, 2009, at 6:19 PM, Steven Legg wrote:

The potential for an inadvertent change of normalization in the LDIFv2 if
it is edited doesn't overly concern me. Stringprep takes care of it for
matching purposes


Not for userPassword and the like.


The extended format in the ELDIF specification in its current form can alter
end-of-line characters so I only use it for syntaxes where I know this is
harmless, which basically means XED syntaxes that are known to contain only XML
documents. Since the Octet String syntax doesn't fit this criterion, the
userPassword attribute never uses the extended format. If we generalize the
extended format to allow "here" documents with the unmodified literal LDAP
attribute value, then I would expect the extended format to be limited to
syntaxes that are known to produce exclusively UTF-8 character strings,
which would continue to exclude userPassword.

When I dump userPassword values they are encrypted, so even if the contents
of the octet string were UTF-8 to start with, they probably aren't after
the encryption is done with them.
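
The distinction between string syntaxes and octet strings can be illustrated with a minimal sketch, assuming Python's standard unicodedata module. LDAP string preparation (RFC 4518) normalizes to NFKC before matching; only that normalization step is shown here, not the full preparation algorithm. The helper names octets_equal and prepared_equal are hypothetical.

```python
import unicodedata

# The same visible text in two Unicode normalization forms.
nfc = unicodedata.normalize("NFC", "Zo\u00eb")  # precomposed e-diaeresis
nfd = unicodedata.normalize("NFD", "Zo\u00eb")  # "e" + combining diaeresis


def octets_equal(a: str, b: str) -> bool:
    """Compare the way an octet-string match would: byte for byte."""
    return a.encode("utf-8") == b.encode("utf-8")


def prepared_equal(a: str, b: str) -> bool:
    """Compare after NFKC normalization (one step of RFC 4518 preparation)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)
```

With these definitions, octets_equal(nfc, nfd) is false while prepared_equal(nfc, nfd) is true, which is why a change of normalization is harmless for prepared string matching but not for an octet-string value.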

Not for value syntaxes that require a specific normalization to be applied and otherwise result in a syntax error.


I'm not aware of any such syntax.

And, if end-of-line characters appearing in values are not required to be base64'ed or otherwise escaped, there will be inadvertent changes of end-of-line characters to deal with.


It's best to be tolerant of such variations anyway since editing by
ordinary LDAP clients could create such inadvertent changes.
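
One way to be tolerant of end-of-line variations is to compare values only after mapping every line ending to a single convention. The following is a minimal sketch of that idea; the function names are hypothetical and nothing here is taken from the ELDIF specification.

```python
def normalize_eol(value: str) -> str:
    """Map CRLF and bare CR line endings to LF."""
    # Replace CRLF first so that it does not turn into two line feeds.
    return value.replace("\r\n", "\n").replace("\r", "\n")


def same_ignoring_eol(a: str, b: str) -> bool:
    """Treat two values as equal if they differ only in line endings."""
    return normalize_eol(a) == normalize_eol(b)
```

This is only appropriate for syntaxes where the choice of line ending carries no meaning, such as values known to hold XML documents.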

LDIFv1 avoided such problems by limiting the characters in values that could appear without being base64'ed to a subset of the ASCII subset of characters. These issues haven't gone away since the introduction of LDIFv1.

and any client that expects attribute values to be in,
or remain in, a particular normalization form is asking for trouble.

If a technical specification says an attribute value is to be in a particular Unicode normalization form, then all clients supporting that technical specification need to provide the values of that attribute in a particular Unicode normalization form.
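
As background to the LDIFv1 restriction mentioned above: RFC 2849 lets a value appear unencoded only if every octet is ASCII other than NUL, LF and CR, and the first octet is additionally not a space, colon or "<". It also notes that values ending in a space should be base64-encoded. The sketch below applies those rules; the function names are hypothetical and line folding is deliberately left out.

```python
import base64

# Octets that may not appear anywhere in an unencoded value: NUL, LF, CR.
UNSAFE_ANYWHERE = {0x00, 0x0A, 0x0D}
# Octets that additionally may not start one: space, colon, less-than.
UNSAFE_INITIAL = UNSAFE_ANYWHERE | {0x20, 0x3A, 0x3C}


def is_safe_string(value: bytes) -> bool:
    """Return True if the value can be written without base64 encoding."""
    if not value:
        return True
    if value[0] in UNSAFE_INITIAL or value[-1] == 0x20:
        return False
    return all(b <= 0x7F and b not in UNSAFE_ANYWHERE for b in value)


def ldif_line(attr: str, value: bytes) -> str:
    """Render one attribute value as an LDIFv1 line (no line folding)."""
    if is_safe_string(value):
        return f"{attr}: {value.decode('ascii')}"
    return f"{attr}:: {base64.b64encode(value).decode('ascii')}"
```

Because any non-ASCII octet or embedded line break forces base64, an LDIFv1 file edited as text cannot have its normalization or end-of-line characters changed unnoticed inside a value.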


I don't know of any such specification for an existing syntax that produces
exclusively UTF-8 encodings. It would be most unwise for such a requirement
to be placed on the use of an existing syntax (e.g., Directory String)
because of the installed base of software that just wouldn't honour the
requirement. If it's a new syntax, then it wouldn't be known to existing
ELDIF implementations so attributes of the syntax wouldn't use the
extended format. Depending on the details, when I got around to implementing
it (i.e., making it known) I might explicitly exclude it from using the
extended format and/or have my ELDIF parser renormalize the values it is
importing.
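
Renormalizing on import could look roughly like the following. This is a sketch under stated assumptions, not a description of any existing ELDIF parser: the attribute type exampleNormalizedName, the choice of NFC, and the function import_value are all hypothetical.

```python
import unicodedata

# Hypothetical attribute types whose syntax requires values in NFC.
# Stored in lower case because LDAP attribute type names are case-insensitive.
NFC_REQUIRED = {"examplenormalizedname"}


def import_value(attr_type: str, raw: bytes) -> bytes:
    """Return the octets to store for a UTF-8 value read from a file."""
    if attr_type.lower() not in NFC_REQUIRED:
        return raw  # leave every other value exactly as given
    text = raw.decode("utf-8")
    return unicodedata.normalize("NFC", text).encode("utf-8")
```

Values of all other attribute types pass through untouched, so the exact character sequence supplied is preserved wherever no normalization is mandated.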

The values could be modified by some other client that changes the normalization
during editing and I wouldn't count on every directory implementation
preserving the exact character sequence it is given (though mine does).

If the normalization is specified as part of the LDAP syntax for the attribute value syntax, it follows that there would be a requirement for directory servers to preserve that normalization. Or the value might be stored in an octet string (like userPassword) and the server required to preserve the octets and hence the normalization.


Putting it in an octet string would protect it from using the extended format,
as I envisage its applicability.

Regards,
Steven

If a client needs the values to be in a particular normalization form it
should do the conversion itself.
We already have one standard attribute, userPassword, where values (when text) SHOULD be provided in a particular Unicode normalization.
-- Kurt

