Re: invalid syntax when teletexstring

2011/7/29 Howard Chu <hyc@symas.com>:
> Howard Chu wrote:
>> Erwann ABALEA wrote:
>>> Do you have any document or pointer to understand the task of
>>> converting to/from T.61, and incompatible character sets you talked
>>> about? I Googled for this, but I'm not sure of what I found (what I
>>> found reminds me of old character sets we used many years ago in
>>> France for the Minitel, with G1/G2 character groups, etc, not that far
>>> from VT consoles).
>> You can reference this old draft; I wrote Appendix A and B to document the
>> mapping as we understood it at that time. These Appendices were dropped
>> from the final version because it was considered futile to attempt to
>> document the T.61 character encoding rules.
>> http://tools.ietf.org/html/draft-ietf-ldapbis-strprep-00#appendix-A
>> You can also read libldap/t61.c; the code has been present in every
>> OpenLDAP release since 2002 but is not compiled or used.
> This Guide has a pretty good discussion of the issues.
> http://www.cs.auckland.ac.nz/~pgut001/pubs/x509guide.txt
> The section on "Character Sets" is particularly relevant. The section on
> "Comparing DNs" is somewhat relevant, though in fact OpenLDAP has already
> solved this problem (for all the string types besides T61String) by doing
> all matching in UTF-8.

Thank you for the pointers. I appreciate Peter's writings and had
already read this text some time ago, but I wasn't focused on T.61 then.
OpenSSL 1.0.0 internally stores names in a "semi-normalized" UTF-8
form (redundant spaces removed, everything converted to lowercase,
but no NFC/NFD normalization).
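
To make the "semi-normalized" form concrete, here is a minimal Python sketch of the three properties described above. It is not OpenSSL's code: the function name `semi_normalize` is hypothetical, and using `str.lower()` for the case folding is an assumption made for illustration.

```python
import unicodedata

def semi_normalize(name: str) -> str:
    """Illustrative only: trim and collapse whitespace, then lowercase.

    Deliberately performs no NFC/NFD normalization, to mirror the
    behaviour described in the message above.
    """
    collapsed = " ".join(name.split())  # drop leading/trailing/repeated spaces
    return collapsed.lower()            # assumption: plain str.lower()

# The same visible string in composed (NFC) and decomposed (NFD) form.
nfc = unicodedata.normalize("NFC", "Caf\u00e9")
nfd = unicodedata.normalize("NFD", "Caf\u00e9")

print(semi_normalize("  CSCA_CZ  "))                # csca_cz
print(semi_normalize(nfc) == semi_normalize(nfd))   # False
```

The second result is the point: without an NFC/NFD step, two encodings of the same visible name still compare as different after semi-normalization.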

I'm now reading libldap/t61.c. I just read the IETF draft and its
numerous tables... What a mess. X.680 references the T.61
recommendation, which was deleted some years ago, and I'm not clever
enough to make Google find a copy of the standard. It can no longer be
bought from the ITU, but later standards still reference it. Nice.

Meanwhile, I still haven't found the Czech CSCA certificate, but I
know what to do with the remaining 1% uncertainty. The CN field is
encoded as a T61String holding the value "CSCA_CZ", which fits well
within the 7-bit range.
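
The 7-bit claim can be checked mechanically. The sketch below assumes the raw content octets of the CN are already available as a byte string; the helper name `fits_in_7_bits` is hypothetical. It only tests that no octet has the high bit set, and does not answer whether every 7-bit T.61 code point maps to the same character as in ASCII.

```python
def fits_in_7_bits(raw: bytes) -> bool:
    """Return True if every octet is below 0x80 (high bit clear)."""
    return all(octet < 0x80 for octet in raw)

cn = b"CSCA_CZ"  # value quoted in the message above
print(fits_in_7_bits(cn))            # True
print(fits_in_7_bits(b"\xc8esko"))   # False: 0xC8 has the high bit set
```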

If everything is internally converted to UTF-8, and t61.c seems to
provide a lossless T.61-to-UTF-8 conversion, why isn't it used?