
Re: Stringprep Considered Harmful



On 15-Nov-04, at 1:31 AM, Kurt D. Zeilenga wrote:

But one has to consider whether those same things are reasonable
for directory strings in general.  In particular, one has to
carefully consider the Security Considerations outlined in 9.1
of [Stringprep] apply to directory strings.

Indeed. I have read that section several times. It appears to be primarily concerned with visual spoofing attacks, which are obviously relevant for domain names. However, I note that the bidi prohibition only deals with one, fairly small, class of such attacks, and possibly does not deal with all of those (see the discussion last year on the W3C IRI mailing list).

The ldapbis strprep draft says in its abstract that the motivation
for the specification is interoperability; there is no mention of
security as a primary motivator. Interoperability does not require
bidi prohibition; Unicode strings are always stored in logical
order, which is precisely the order in which they should be compared,
regardless of how many directionality runs there are.
(And consequently, to answer a previous question, mixed directionality
text has no impact on substring matching.)
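
A minimal Python sketch of the point about logical order follows. The sample strings and the helper name `contains` are illustrative assumptions, not taken from any draft. Matching is done on code points in storage order, so the number of directionality runs never enters into it.

```python
# Strings are held in logical (typing) order; display order is computed
# only at rendering time by the bidi algorithm.
latin = "abc"
hebrew = "\u05d0\u05d1\u05d2"  # alef, bet, gimel in logical order
mixed = latin + " " + hebrew + " 123"


def contains(haystack: str, needle: str) -> bool:
    """Plain code-point substring test; directionality plays no part."""
    return needle in haystack


print(contains(mixed, hebrew))            # True
print(contains(mixed, "c " + hebrew[0]))  # True: the match spans two runs
print(contains(mixed, hebrew[::-1]))      # False: visual order is not stored
```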

Other than the bidi prohibition rule, the remaining specifications,
although well-intentioned people may disagree about details (such
as which, if any, compatibility decompositions should be required),
are highly relevant to the goal of interoperability, and any standard
would be welcomed (at least by me) as better than no standard at all.

Bidi prohibition therefore stands out as an odd exception in the specification.

I don't know what consultation the working group carried out before
adopting the bidi prohibition rule, but I hope you will not be
offended by my saying that the messages you refer to (I presume these
are http://www.openldap.org/lists/ietf-ldapbis/200312/msg00087.html
and http://www.openldap.org/lists/ietf-ldapbis/200401/msg00019.html)
do not appear to indicate either detailed consultation with a number of
experts ("after chatting with Paul Hoffman again") or careful consideration
by the working group (the only evidence being that no concerns were
expressed between December 19 and January 5, two weeks in which I believe
most people in the western world are thinking about something other than
string normalization).


It seems odd to me to propose to use bidi prohibition for strings
which might have multiple components. Both of the bidi prohibition
applications I am aware of, the nameprep profile and the IRI internet
draft (http://xml.coverpages.org/draft-duerst-iri-09.txt)
are quite clear that the bidi prohibition rule should be performed
component by component, and not on strings as a whole, where it would
obviously be counterproductive.
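
To make the difference concrete, here is a rough Python sketch of requirements 2 and 3 of section 6 of [Stringprep], using the standard-library `stringprep` module (tables D.1 and D.2). The function name `passes_bidi_rule`, the sample string, and the "/" separator are assumptions made for illustration only; requirement 1 (the prohibited characters of table C.8) is left out.

```python
import stringprep


def passes_bidi_rule(s: str) -> bool:
    """Sketch of requirements 2 and 3 of RFC 3454 section 6."""
    if not any(stringprep.in_table_d1(c) for c in s):
        return True   # no RandALCat character: nothing to check
    if any(stringprep.in_table_d2(c) for c in s):
        return False  # RandALCat and LCat characters are mixed
    # A RandALCat character must be both first and last.
    return stringprep.in_table_d1(s[0]) and stringprep.in_table_d1(s[-1])


# Hypothetical multi-component value with "/" as the separator.
value = "docs/\u05d0\u05d1\u05d2/readme"

print(passes_bidi_rule(value))                           # False
print([passes_bidi_rule(c) for c in value.split("/")])   # [True, True, True]
```

Every component is acceptable on its own, yet the string as a whole is rejected, which is the counterproductive outcome described above.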

The IRI draft, in particular, provides a number of useful recommendations
for the display of IRIs in order to avoid visual spoofing, but even
after articulating the bidi prohibition rule, it notes (p. 18):


    The above restrictions are given as shoulds, rather than as musts.
    For IRIs that are never presented visually, they are not relevant.
    However, for IRIs in general, they are very important to insure
    consistent conversion between visual presentation and logical
    representation, in both directions.

IRIs (perhaps less so than FQDNs) are used in contexts where visual
spoofing attacks are high-profile. But I wonder if a similar argument
can be made for general-purpose directory entries.

In fact, it is not clear that LDAP as a protocol should concern itself
at all with what is essentially a question of how to visually render
information rather than how to store, transport, and retrieve it.
Of course, guidance on rendering such information would always be useful.


It would seem
these considerations apply well to values commonly used for naming
and in security applications (such as PKIX), and even descriptive
text.

I'm afraid this doesn't seem to me to be a sufficient justification for
bidi prohibition. Descriptive text can be a complete lie, regardless of
whether it contains bidi characters. There are an enormous number of
visual spoofing attacks which have nothing to do with the bidi rendering
algorithm (for example, the use of Greek and Cyrillic characters which
render identically to Latin characters) which are both easier to accomplish
and do not depend on the victim of the spoof being fluent in two languages
with differing writing directions.
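
As an illustration of such a spoof, the following Python sketch compares a Latin word with a look-alike whose first letter is Cyrillic. The word "admin" is an arbitrary example chosen here. Every character has bidi class L, so a bidi rule has nothing to object to, and NFKC normalization with case folding does not merge the two strings.

```python
import unicodedata

genuine = "admin"
spoof = "\u0430dmin"  # first letter is CYRILLIC SMALL LETTER A

print(unicodedata.name(genuine[0]))  # LATIN SMALL LETTER A
print(unicodedata.name(spoof[0]))    # CYRILLIC SMALL LETTER A

# Both strings are purely left-to-right, so no bidi check applies.
print({unicodedata.bidirectional(c) for c in genuine + spoof})  # {'L'}


def normalized(s: str) -> str:
    """Compatibility normalization plus case folding."""
    return unicodedata.normalize("NFKC", s).casefold()


print(normalized(genuine) == normalized(spoof))  # False
```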


Regardless of the BIDI issue, caseIgnore*Match/caseExact*Match
are inappropriate rules for matching URLs as the rules are not
aware of the particulars of the URL syntax, such as escaping.

I accept that. However, there does not appear to be any proposal for a syntax useful for URLs, and it is likely that DirectoryString will be used in real applications.

There is no standard way to use Arabic or Hebrew in
mailbox names.  The existing 'mail' attribute is restricted
to IA5

There are many people (at least in the directories I work with) whose mailboxes are not in IA5. I think this is increasingly common, whether or not there is a standard. I wonder how typical LDAP administrators deal with this; the use of DirectoryString syntax seems highly probable to me.

As for domain components, in LDAP, domain components
should be IA5 strings (constrained to the appropriate
LDH subset).

In that case, the bidi prohibition is not necessary for the dc attribute. I stand corrected. (It is also true that distinguished names SHOULD be stored in attributes with DN syntax. However, I believe there are cases where they are not; for example, a DN might appear as part of the value of a description attribute.)

Though you didn't detail what kind of file path you were
referring to, but I would suspect that case{Ignore,Exact}*Match
are inappropriate for most if not all common file path syntaxes
(due to significant space issues, escaping, and various other
issues).

This is true with the application of the LDAPprep profile. However, the filesystems I most commonly use do not employ escape characters (indeed, I am not aware of a filesystem which does) and could appropriately be matched by one of case{Ignore,Exact}Match aside from the issue of insignificant spaces. I believe that the majority of traditional Unix filesystems would be better matched with a pure octet string match, but the Macintosh filesystem I use daily uses case-folding followed by Unicode normalization; it differs in some details from the LDAPprep proposal, but in most cases (barring the bidi prohibition) the results will be quite similar. Insignificant spaces in filepaths (and other strings) are another instance of visual spoofing.
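
The two styles of filename matching can be contrasted with a short Python sketch. This is only an approximation under stated assumptions: the function names are invented here, and `str.casefold` with NFD stands in for whatever folding and decomposition tables a particular filesystem really uses, which differ in detail.

```python
import unicodedata


def octet_match(a: str, b: str) -> bool:
    """Traditional Unix style: names match only if their bytes match."""
    return a.encode("utf-8") == b.encode("utf-8")


def folded_match(a: str, b: str) -> bool:
    """Case folding followed by canonical decomposition (approximation)."""
    def key(s: str) -> str:
        decomposed = unicodedata.normalize("NFD", s)
        return unicodedata.normalize("NFD", decomposed.casefold())
    return key(a) == key(b)


precomposed = "R\u00e9sum\u00e9.txt"    # e-acute as single code points
decomposed = "re\u0301sume\u0301.TXT"   # e + combining acute, other case

print(octet_match(precomposed, decomposed))   # False
print(folded_match(precomposed, decomposed))  # True
```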

Again, since there do not appear to be clear alternative matching
rules and syntaxes, it is likely that administrators will make do
with what there is.

I view all of cases listed as being application-specific cases
where the general purposes matching rules (and possible the
directory string syntax) simply are inappropriate.  These special
cases would be better handled by application-specific matching
rules.

In some sense, all directory strings are application-specific cases. Indeed, if URLs, email addresses, and filepaths are application-specific, one might wonder what a general case for a directory might be, and what purpose, if any, would be served by applying bidi prohibition to it.

The BIDI restrictions are appropriate for most (if not all of)
the standard track attribute types using these matching rules,
especially in light of various security considerations (such
as those elaborated on in 9.1 of [Stringprep]).

Looking through the standard track attributes, I actually
see very few for which the bidi prohibition rule would be appropriate.
The rule is inapplicable to attributes which are numeric or restricted
to IA5, and as you point out, distinguishedName already has a specific
matching rule.


Postal Addresses in countries which use right-to-left scripts are likely
to contain right-to-left characters and start or end with a number.
Such addresses would fail the bidi prohibition rule.

Names are more likely to be single-directional, but some people
do like to include both right-to-left and left-to-right nicknames
in their common names. Such entries would fail the bidi prohibition
rule, and I cannot see a compelling reason to tell my colleagues to
not do this.
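
The address and name cases can be checked against section 6 of [Stringprep] with a small Python sketch that reports which requirement a value violates. The helper name `bidi_violation` and the sample values (a Hebrew street name followed by a house number, and a Hebrew name with a Latin nickname) are hypothetical; the table C.8 prohibition is again not covered.

```python
import stringprep
from typing import Optional


def bidi_violation(s: str) -> Optional[str]:
    """Return the violated RFC 3454 section 6 requirement, or None."""
    if not any(stringprep.in_table_d1(c) for c in s):
        return None
    if any(stringprep.in_table_d2(c) for c in s):
        return "mixes RandALCat and LCat characters"
    if not (stringprep.in_table_d1(s[0]) and stringprep.in_table_d1(s[-1])):
        return "does not start and end with a RandALCat character"
    return None


# Hypothetical postal address: Hebrew street name, then a house number.
address = "\u05e8\u05d7\u05d5\u05d1 \u05d4\u05e8\u05e6\u05dc 12"
# Hypothetical common name: Hebrew name with a Latin nickname.
name = "\u05d3\u05d5\u05d3 (Dave)"

print(bidi_violation(address))  # does not start and end with a RandALCat character
print(bidi_violation(name))     # mixes RandALCat and LCat characters
print(bidi_violation("\u05d3\u05d5\u05d3"))  # None
```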

Descriptive text (such as, in the case of a directory important
to me, the titles of development-assistance projects) is quite
likely to contain text with both directionalities.

Other attributes, such as country, location, and generationalQualifier
do not appear to be likely targets for bidi spoofing attacks.

Of course, I freely admit that I am not a recognized expert in any
technology; merely someone trying to implement LDAP directories
which might be useful to an international organization working
in countries with a variety of scripts. Perhaps I am missing
something obvious. But I feel it would be a shame if LDAPbis, which
contains many useful and well-articulated ideas, were to render
itself less useful in a multilingual world as a result of the
inappropriate application of bidi prohibition.

Rici