Re: Grammar nits (Re: [Fwd: I-D ACTION:draft-good-ldap-ldif-04.txt])
Harald Tveit Alvestrand writes:
>>
>>How do we say "any character except NUL, CR or LF" in ABNF when we don't
>>know the max integer code of a character in the parser's character set?
>>Assume iso10646 and say something like `%x01-09/%x0B-0C/%x0E-7FFFFFFF'?
>
> RFC 2234 is quiet here:
Meaning "we can't"?
> The ACAP specs have chosen to represent their grammar as a grammar of
> octets, meaning that the correct "high value" is 255, or 0xFF.
...or it means "an ACAP grammar describes the file in terms of octets"?
How nice for hosts with 9-bit or 16-bit bytes:-)
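To make the two readings concrete, here is a minimal Python sketch of "any character except NUL, CR or LF", once over iso10646 code points (using the tentative upper bound 0x7FFFFFFF suggested above) and once over octets (upper bound 0xFF, as in the ACAP reading). The names `in_ranges`, `is_allowed_char` and `is_allowed_octet` are made up for illustration; none of this comes from the draft or from RFC 2234.

```python
# Hypothetical helpers; the ranges mirror %x01-09 / %x0B-0C / %x0E-<max>.

def in_ranges(value, ranges):
    """Return True if value falls inside any inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)

# Reading 1: a "character" is an iso10646 code point (assumed 31-bit maximum).
CODE_POINT_RANGES = [(0x01, 0x09), (0x0B, 0x0C), (0x0E, 0x7FFFFFFF)]

# Reading 2: a "character" is a single octet.
OCTET_RANGES = [(0x01, 0x09), (0x0B, 0x0C), (0x0E, 0xFF)]

def is_allowed_char(code_point):
    return in_ranges(code_point, CODE_POINT_RANGES)

def is_allowed_octet(octet):
    return in_ranges(octet, OCTET_RANGES)
```

Under the octet reading every octet of a UTF-8 encoded character is checked on its own; under the code-point reading the input has to be decoded first.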
> This actually brings out an important question:
> What's the character set of an LDIF file?
> Note 8 to the grammar seems to assume that the character set is UTF-8,
Note 8 says the input file's encoding can be anything, but the generated
LDIF content (the output) must be UTF-8.
I think the input file must be converted to UTF-8 _before_ it is fed to
the grammar, since the grammar describes LDAP strings (= UTF-8 strings).
Maybe the draft should say so.
However, I'm not sure that answers your question even if the file is
UTF-8. Is a `character' an octet, a sequence of UTF-8 octets which
encodes an iso-10646 character, or that iso-10646 character?
If it is not the former, we can't fold a line in the middle of a
multi-octet encoded iso10646 character.
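To illustrate what that restriction would mean for a writer, here is a rough Python sketch of folding that only breaks between complete UTF-8 sequences. It assumes that a folded line continues on the next line after a single leading space; the function name `fold_utf8` and the default width of 76 octets are my own choices for the example, not taken from the draft.

```python
def fold_utf8(line: bytes, width: int = 76) -> list[bytes]:
    """Fold a UTF-8 encoded line so that no piece exceeds `width` octets
    and no break falls inside a multi-octet sequence (illustrative only)."""
    if width < 5:
        raise ValueError("width must leave room for a 4-octet sequence")
    pieces = []
    start = 0
    limit = width                 # the first piece has no leading space
    while len(line) - start > limit:
        end = start + limit
        # Octets of the form 10xxxxxx are UTF-8 continuation octets.
        # Back up until the next piece would begin on a lead octet.
        while end > start and (line[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:          # invalid UTF-8: fall back to an octet split
            end = start + limit
        pieces.append(line[start:end])
        start = end
        limit = width - 1         # later pieces carry one leading space
    pieces.append(line[start:])
    return [pieces[0]] + [b" " + piece for piece in pieces[1:]]
```

If a `character' is just an octet, the inner loop is unnecessary and a fold may land anywhere.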
> and the changelog says this is "clarified", but I can't find the
> clarification....
I guess note 8 was added in version -01, which is where the changelog
says it was clarified.
> Here are the REAL grammar nits from version 04:
>
> - missing endquote for "control"
Yup. See my list of grammar bugs.
> - extra space in front of repetition in second line of same definition
Didn't catch that one.
> DIGIT: Used 1 times, but not defined
> base-64-dn: Used 1 times, but not defined
> base-64-rdn: Used 1 times, but not defined
> base64-rdn: Defined but not used
Yup.
> NUL: Defined but not used
Well, it's used in the description of <safe>.
--
Hallvard