
Re: more SASLprep/protocol problems



Hallvard,

This text is intended to bring LDAP into conformance with RFC 2277
(BCP18) as well as improve matching of text as discussed in RFC 3454.

RFC 2277 mandates that protocols support UTF-8 for all character
data. While RFC 2277 allows protocols to support other character sets
and encoding forms, LDAP (as currently specified) provides no
mechanism to enable this.  That is, there is no mechanism in LDAP
to negotiate which character set/encoding is in use.

We cannot simply allow implementations to do as they please, as this
prevents interoperability (in the IETF sense).
 


At 09:41 AM 10/1/2003, Hallvard B Furuseth wrote:
>Kurt D. Zeilenga writes:
>> I think there are opposing objectives.  One faction of the
>> community is attempting to improve interoperability between
>> independently developed implementations.  One faction is
>> attempting to support legacy systems.
>
>Are there some implementations that do prepare passwords, or at least
>translate them to UTF-8, and are widely used on non-ASCII or ASCII-
>superset sites with hashed passwords, and thus prove that the problem
>is smaller than I think?

There are many implementations which prepare passwords
in some way....  I know of implementations which:
        transcode EBCDIC to ASCII,
        transcode T.61 to Unicode/UTF-8,
        re-encode UTF-16 to UTF-8,
        apply a Unicode normalization transformation
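
As a rough illustration of the kinds of preparation listed above, here
is a small Python sketch.  The sample passwords and the choice of EBCDIC
code page (cp037) are my assumptions for illustration only.  T.61 is
left out because the Python standard library has no codec for it.

```python
import unicodedata

# Transcode EBCDIC to ASCII (code page 037 is assumed here).
ebcdic_octets = "secret".encode("cp037")
ascii_octets = ebcdic_octets.decode("cp037").encode("ascii")

# Re-encode UTF-16 to UTF-8 (hypothetical non-ASCII sample password).
utf16_octets = "p\u00e4ssword".encode("utf-16")
utf8_octets = utf16_octets.decode("utf-16").encode("utf-8")

# Apply a Unicode normalization transformation: the decomposed form
# "a" + COMBINING DIAERESIS becomes the single precomposed character.
decomposed = "pa\u0308ssword"
composed = unicodedata.normalize("NFC", decomposed)

print(ascii_octets, utf8_octets, composed == "p\u00e4ssword")
```

In each case the octets that leave the client differ from the octets
the client started with, which is all "prepare" means here.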

>> (Note: It is possible to split the preparation between the client
>> and the server, PLAIN does this (client handles transcoding
>> to Unicode and UTF-8 encoding, server handles SASLprep).  With
>> LDAP, this is not a good option as the server has no way to
>> determine whether the password is textual or not.)
>
>I've seldom wondered about so many things in so short a statement:-)
>
>What do you mean by textual and non-textual passwords, exactly?

A textual password consists of character data.

>Why are PLAIN passwords more textual than LDAP passwords?
>Is it simply because PLAIN _requires_ passwords to be UTF-8 which is
>presumably text, while LDAP only will recommend it?

PLAIN passwords are character data.
LDAP passwords are octet strings which sometimes represent character data.

>If the server and therefore the sysadmin doesn't know whether to prepare
>a bind password or not, how is the client supposed to know - unless the
>user tells it?

Clients (whether used by end-user or sysadmin) have, in
general, knowledge of the character set and encoding they are
using to interact with the user.  That knowledge is NOT
communicated by the client to the server.

I note as well that existing external passwords stores do not
maintain knowledge of the character set/encoding used by the
admin, they just store whatever octet string the password-setting
application provides.  The assumption is that the authenticating
application will use the same character set, normalization,
encoding algorithm of the password-setting application.

This assumption is flawed in that:
        a) users may use different platforms and/or platform
          settings than their administrator,
        b) users may use different platforms and/or platform
          settings from time to time.

To deal with this flawed assumption, many deployments are forced
not to internationalize.  Instead they use the lowest common
denominator (commonly some subset of ASCII).

The internationalization of LDAP passwords takes this into
account.  The specification preserves the encoding of
printable ASCII passwords.

>Which scenario are you thinking of when you say the server doesn't know?

Most.

>Not that I disagree that client-side preparation is most flexible,
>but... usually all passwords will be UTF-8 or none of them will,
>depending on how the sysadmin put them there, so the server will know.

Clients I have used lately have encoded my password (internally)
using:
        ISO 8859-1, UTF-8, UTF-16, UTF-32.

(I haven't used an EBCDIC client lately.)  Luckily the non-UTF-8
clients prepared the password before transferring it.  The server
has no knowledge of this.  It simply treats my password as an
octet string.

>OTOH, the server won't know if they should be prepared if the users
>stored userPasswords as they pleased, both UTF-8 and non-UTF-8.
>But then you _really_ need a client option to say whether or not to prepare
>bind passwords, so I don't suppose that's what you are talking about...

OTOH, say each of my clients just sent the password as they pleased
with the expectation that the server deal with it.  The problem is,
of course, there is nothing in the protocol which indicates that
the password field is character data and, if so, which character
set/encoding was used.  So, the server would have to guess.
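
To make the guessing problem concrete, here is a minimal Python sketch.
The sample password is made up, and the codec names are Python's, not
anything from the protocol.  It shows that one textual password yields
a different octet string under each encoding, and that the same octets
decode without error under more than one character set.

```python
# Hypothetical sample password with one non-ASCII character.
password = "p\u00e4ssword"

# The encodings mentioned above, under their Python codec names.
encodings = ["iso-8859-1", "utf-8", "utf-16-le", "utf-32-le"]

octets = {name: password.encode(name) for name in encodings}
for name, data in octets.items():
    print(f"{name:12} {data.hex()}")

# All four octet strings differ, so an octet-wise comparison on the
# server fails unless every client sends the same encoding.
all_distinct = len(set(octets.values())) == len(encodings)

# The UTF-8 octets are also valid ISO 8859-1 (just a different string),
# so the server cannot reliably tell the encodings apart by inspection.
misread = octets["utf-8"].decode("iso-8859-1")
```
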

>(Hey! That's another argument for recommending that client option!:-)

Actually, I think you just made a good argument for why we
cannot allow clients to do as they please.

>> How's this?
>>
>>      The simple form of an AuthenticationChoice specifies a simple
>>      password to be used for authentication.  To improve matching
>>      of textual passwords, clients SHOULD prepare textual passwords
>>      for transfer by transcoding to [Unicode], applying [SASLprep],
>>      and encoding as UTF-8.  Clients MAY prepare textual passwords
>>      using other algorithms (including null preparation) to support,
>>      for instance, legacy systems.  Non-textual passwords MUST NOT
>>      be mutated.
>
>I should mention that I do prefer that text over the current one, though
>as you know I still want options.  I'd prefer to replace "legacy
>systems" with something like "servers using existing hashed password
>stores", since "legacy systems" has rather negative connotations.

How about s/legacy systems/external password stores/ instead?
I'd rather not use the term "hashed".

>Also I wonder what "Non-textual passwords MUST NOT be mutated" means, see
>comments above.

It means that implementations are not to alter non-text
(non-character data) passwords.
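
For what it is worth, the client-side behavior in the proposed text
could be sketched in Python roughly as follows.  This is a simplified
illustration, not a conforming implementation: the function names and
the charset parameter are hypothetical, the table selection assumes the
SASLprep profile as later published in RFC 4013, and the bidirectional
and unassigned-code-point checks are omitted.

```python
import stringprep
import unicodedata

# Prohibited-output tables named by SASLprep (RFC 4013, section 2.3).
_PROHIBITED = (
    stringprep.in_table_c12, stringprep.in_table_c21_c22,
    stringprep.in_table_c3, stringprep.in_table_c4,
    stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)

def saslprep_sketch(text):
    """Simplified SASLprep: map, normalize (NFKC), reject prohibited output.

    The bidi and unassigned-code-point checks are omitted for brevity.
    """
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):       # "commonly mapped to nothing"
            continue
        # Non-ASCII space characters are mapped to U+0020.
        mapped.append(" " if stringprep.in_table_c12(ch) else ch)
    normalized = unicodedata.normalize("NFKC", "".join(mapped))
    for ch in normalized:
        if any(check(ch) for check in _PROHIBITED):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return normalized

def prepare_simple_password(password, charset=None):
    """Return the octets for the simple AuthenticationChoice.

    `password` is bytes.  `charset` names the client's local encoding
    for a textual password, or is None for a non-textual password.
    """
    if charset is None:
        return password                      # non-textual: not mutated
    text = password.decode(charset)          # transcode to Unicode
    return saslprep_sketch(text).encode("utf-8")
```

With this sketch, the same textual password entered on an ISO 8859-1
client and on a UTF-16 client produces the same octets on the wire,
printable ASCII passwords come out unchanged, and a password passed
with no charset is returned octet for octet.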