Re: Revisited: NON-ASCII chars in userPassword



Couldn't a server just provide an administrative setting that it uses to specify the character encoding for the simple password? It's *very* difficult to make a case for changing the ASN.1 in the specification. At the very least, we'd need to determine what ill effects this would have on existing implementations. Worst case is that we have to rev the protocol doc.

Any other opinions?

Jim

>>> Michael Ströder <michael@stroeder.com> 10/29/01 02:25PM >>>
Hi!

Sorry for reviving a really old discussion, months after it started
on the ldap@umich.edu list. But I think it's necessary, and I think
LDAP-BIS is the right forum for it.

Jim Sermersheim wrote:
> 
> >You can't complain to both Netscape and Outlook as being "non-standard".
> >You must choose which of the two vendors is "non-standard".  Either the
> >simple bind uses 8859-1 or it uses UTF-8.  There MUST be a character
> >symbol mapping to the octetString representation.
> 
> The simple bind allows a string of *any* octets. After reflecting
> on Kurt's message, the octet-string representation of both 8859-1
> and UTF-8 are both subsets of *any* octets, thus both are compliant

Yes. But I consider it to be a flaw in LDAP design that both are
compliant.

> (just restrictive)

Well, passwords are normally entered by a keyboard. Therefore
passwords *are* limited to what you can enter on any keyboard.

> For those servers restricting the string to
> containing characters (whether 8859-1 or UTF-8) I believe there are well 
> understood ways of transmitting those characters as octet strings.

It is not the server that restricts the strings. The clients are
free to choose any character set they want.

> ? I think this field is transmitted by all vendors as an octet string.

Yes.

> Some servers don't place any restriction on the octets that

I think no server places any restriction on the octets in
userPassword or the credentials passed in with simple
authentication.

> >Actually I prefer treating userPassword as the UTF-8 representation of
> >the characters typed.
> 
> This restricts me from using machine-generated passwords that are
> made up of arbitrary strings of octets.

We are not talking about arbitrary credentials for arbitrary
authentication mechanisms. We're just talking about normal passwords
for simple authentication.

Or what strange kind of keyboard do you have to enter null-bytes?
Ok, if you have too much time you can enter escape-sequences with
hex-codes... ;-) But what the underlying OS makes out of it is a
completely different thing.

To avoid further misunderstanding:

Yes, the encoding of the password transmitted on the wire in a
BindRequest is OCTET STRING. That's good and shouldn't be changed.
Period. This is not the issue!

But a user usually enters a password through the keyboard. The key
code is mapped by the operating system into some internal code, the
local code page. In the X.500 days it was assumed that the DUA and
the DSA would have some "local agreement" about the character set
and character encoding (not to be confused with the transfer
encoding of the LDAP message). In its simplest form this local
agreement could amount to both running on the same OS, so that the
DUA and the DSA use the same local code page for the password
without defining anything else. Therefore X.500 did not impose any
rules about the character set used in the credentials for simple
authentication. LDAP borrowed this from X.500 without further
thought.
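
To illustrate the problem, here is a minimal Python sketch. The
password value and the helper name octets_match are made up for this
example; the sketch only assumes that a server compares the
presented octets with the stored octets without interpreting them.

```python
# Hypothetical password containing one non-ASCII character ("geheimä").
password = "geheim\u00e4"

# The same typed characters yield different octets depending on the
# character encoding the client happens to use.
latin1_octets = password.encode("iso-8859-1")
utf8_octets = password.encode("utf-8")


def octets_match(stored: bytes, presented: bytes) -> bool:
    """Octet-wise comparison, as done when the value is opaque octets."""
    return stored == presented


# Stored by a client using ISO-8859-1, presented by one using UTF-8.
print(latin1_octets.hex(), utf8_octets.hex())
print(octets_match(latin1_octets, utf8_octets))
```

With a pure ASCII password both encodings produce identical octets,
which is why the problem only shows up with non-ASCII characters.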

I am no expert in this X.500 history, but IMHO this was never
practical for anything other than using ASCII characters for
passwords.

The situation today is that there is nothing like a local code page.
Even if the LDAP client (DUA) and the LDAP server (DSA) are running
on the same machine you cannot assume that they use the same
character set. Hence a general agreement about which character set
to use should be made. Vendors today have several options:
- Do what they want (take ISO-8859-1 or UTF-8 or name another
favourite exotic character set)
- Limit user input to ASCII chars (e.g. Admin console of Netscape
Directory Server)
- Some weird LDAP hackers might suggest using several different
charsets since userPassword is multi-valued (evil grin).

The only really practical solution today seems to be using solely
ASCII chars. But consider that users might have keyboards without
a single ASCII character on them. Also, being able to use NON-ASCII
chars in passwords is good for password strength.

Therefore I'd vote for not hard-coding OCTET STRING in credentials
for simple bind. Instead of the AuthenticationChoice-definition in
RFC2251 I'd like to propose:

        AuthenticationChoice ::= CHOICE {
                simple                  [0] LDAPString,
                                         -- 1 and 2 reserved
                sasl                    [3] SaslCredentials }

with LDAPString as defined in section 4.1.2 of RFC2251.
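
A rough Python sketch of what this would mean on the wire. RFC2251
defines LDAPString as an OCTET STRING holding UTF-8 encoded
characters and uses implicit tagging, so the [0] element keeps the
same tag and length octets; only the content is constrained. The
function names below are hypothetical, and the encoder handles
short-form BER lengths only.

```python
def encode_simple_credentials(password: str) -> bytes:
    """BER-encode the 'simple' choice: tag 0x80, length, UTF-8 octets."""
    content = password.encode("utf-8")
    if len(content) > 127:
        # Long-form BER lengths are left out to keep the sketch short.
        raise ValueError("sketch handles short-form BER lengths only")
    return bytes([0x80, len(content)]) + content


def is_valid_ldapstring(octets: bytes) -> bool:
    """Check a server could apply: are the octets well-formed UTF-8?"""
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical password "geheimä" sent by a UTF-8 client.
print(encode_simple_credentials("geheim\u00e4").hex())
# The same password as sent by an ISO-8859-1 client would be rejected.
print(is_valid_ldapstring(b"geheim\xe4"))
```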

IMHO this would not be a real compatibility problem since most LDAP
clients are already doing it exactly like this in the absence of a
clearly defined charset.

Ciao, Michael.