Re: non-unix help needed: valid filename characters?

Howard Chu writes:
>Hallvard B Furuseth wrote:
>> Might as well hex-escape (...)
>> (And '/', 8-bit and control chars, and a special hack for windows
>> according to your latest message on the ITS.)
> Honestly, I wouldn't make any special effort for control
> characters. All the filesystems we touch accept them.

Well - I think it'd be nice to be able to visit back-ldif filenames
without trouble.  Still, it's just back-ldif, so as long as it works, it

> 8-bit I'm not so sure about, some filesystems expect UTF-8, while
> others are 8-bit clean to begin with. If we can expect that no one is
> going to use an octetString as a naming attribute, I think we'll be
> fine since everything else will be in UTF-8 anyway.

I've got three or so conflicting opinions on that myself.  On a UTF-8
filesystem, that'll certainly be nice for people using non-Latin
characters.  Assuming they decide to use back-ldif in the first place,
of course.  I suppose it'll happen with private schema include files in
cn=config, at least.  And maybe module names.

So, unless someone else has strong opinions, I guess they'll stay
untouched.  And then there's EBCDIC, where uppercase A-Z chars are
apparently 8-bit.  Though an ('A' == 65) test could catch that.
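
For illustration, here is a minimal sketch of the kind of hex-escaping
discussed above.  It is not the back-ldif code: the function names, the
choice of '%' as the escape character and the exact set of escaped
characters are all assumptions made for the example.  It escapes '/',
the escape character itself and (on ASCII systems only) control
characters, and leaves 8-bit bytes untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the back-ldif code. */
#define ESCAPE_CHAR '%'              /* assumed escape character */
#define CHARSET_IS_ASCII ('A' == 65) /* false on EBCDIC systems */

/* Return nonzero if byte c should be hex-escaped in a filename. */
static int needs_escape(unsigned char c)
{
    if (c == '/' || c == ESCAPE_CHAR)
        return 1;
    /* The control-character ranges below are only valid for ASCII. */
    if (CHARSET_IS_ASCII && (c < 0x20 || c == 0x7f))
        return 1;
    return 0;                        /* 8-bit bytes pass through as-is */
}

/*
 * Copy 'in' to 'out', replacing each escaped byte with ESCAPE_CHAR
 * followed by two hex digits.  Returns 0 on success, or -1 if 'out'
 * (of size 'outlen', including the terminating NUL) is too small.
 */
static int escape_filename(const char *in, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (outlen == 0)
        return -1;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;

        if (needs_escape(c)) {
            if (n + 3 >= outlen)     /* 3 chars plus room for the NUL */
                return -1;
            out[n++] = ESCAPE_CHAR;
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0f];
        } else {
            if (n + 1 >= outlen)     /* 1 char plus room for the NUL */
                return -1;
            out[n++] = (char)c;
        }
    }
    out[n] = '\0';
    return 0;
}
```

With these assumptions, "a/b" would become "a%2Fb", while a name such
as "cn=config" is copied unchanged.  A Windows-specific variant would
need a longer list of escaped characters, which is left out here.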

I didn't believe "everything will soon be UTF-8" a decade ago and I
still don't.

8-bit filenames and data worked just fine for a while until various
applications discovered "internationalization".  That can ruin an
otherwise 8-bit clean system - the OS handles it fine, but not the
apps.  The Windows troubles in the URL you posted seem just the same.

For example, Emacs in its default mode on my system seems unable to
visit 8-bit filenames from its own Dired buffer.  Presumably something
in it disagrees with itself about the encoding of filenames.  I'm sure
there is something I could configure - until I run into the next equally
clever program.  So personally I just stick to 7-bit filenames.