
Re: (ITS#5408) back-ldif bugs

h.b.furuseth@usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: HEAD, RE23, RE24
> OS: Linux
> URL: http://folk.uio.no/hbf/OpenLDAP/back-ldif.c
> Submission from: (NULL) (
> Submitted by: hallvard
> Here are a bunch of back-ldif bugs and some questions.  I have code for most
> of it, but need advice/discussion on functionality changes.
> I've been sitting on it too long and would keep sitting if I waited
> until I'd cleaned up everything, so, posting now instead.  I enclose a
> URL for a *draft* back-ldif/ldif.c.
> Functionality changes.
> * Some LDIF files/directories with special chars need to be renamed:
>    RE24 or RE25 change?  If anyone uses back-ldif as a regular database,
>    they may need to slapcat/slapadd to upgrade past these changes.
>    - back-ldif escapes the directory separator as \<hex value>.  But
>      Windows uses "\" as directory separator, so that doesn't help.  It
>      needs another escape character.
>      To avoid double hex-escaping of DNs that contain "\", we can pick
>      another escape char, e.g. "^", escape that too, and translate "\" to
>      it.  Thus DN "cn=x^y\2Cz" gets filename "cn=x^5Ey^2Cz.ldif".

That sounds fine. Is there any reason not to use %, which is already used for 
URL encoding?

>    - More characters should likely be hex-escaped:
>      + ":" (as in "C:\") and "/" on Windows?  I've seen programs use the
>        latter as directory separator even on Windows.  Others?
>      + 8-bit chars in case the OS gets clever about charset handling.
>      + Control chars.
>      I don't use Windows myself though.  Nor Mac...

Again, the rules for URL encoding probably already cover any cases we should 
worry about.
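
A rough sketch of what %-style escaping could look like, not the actual back-ldif code. The name rdn2fname() and the exact set of escaped characters are assumptions for illustration. It also assumes the RDN is normalized so that every "\" introduces a two-digit hex escape, which lets "\" be translated straight to "%" instead of being hex-escaped a second time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of back-ldif: write the escaped form of
 * the NUL-terminated RDN `in` to `out` (capacity `outlen` bytes).
 * Returns the length written, or -1 if `out` is too small. */
int
rdn2fname(const char *in, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (outlen == 0)
        return -1;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;

        if (c == '\\') {
            /* Assumed: "\" always starts a "\XX" escape, so keep its
             * two hex digits and only swap the escape character. */
            if (o + 1 >= outlen)
                return -1;
            out[o++] = '%';
        } else if (strchr("%/:", c) != NULL || c < 0x20 || c >= 0x7F) {
            /* The escape char itself, separators, control and 8-bit chars */
            if (o + 3 >= outlen)
                return -1;
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        } else {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = (char)c;
        }
    }
    out[o] = '\0';
    return (int)o;
}
```

With this, the DN "cn=x^y\2Cz" from the example above would map to "cn=x^y%2Cz", and the same escaping applies on both Unix and Windows, so a tree could be moved between them.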
>    - When back-ldif uses OS-specific escaping, we can't move a directory
>      tree between Windows and Unix hosts if some RDN in the tree contains
>      characters with OS-specific escaping.
>      Should we special-case both Unix' and Windows' directory separators
>      on both OSes?  Then instead there will be more directory trees which
>      can't be moved between OpenLDAP versions, before and after the
>      change.  Assuming anyone uses back-ldif in the first place:-)

Might as well break it all at once.
>    - The database suffix must be hex-escaped like the rest of the
>      filenames.
>    - RDNs ending with ".ldif" should be escaped.  Currently "cn=foo.ldif"
>      is the name of both RDN "cn=foo"'s entry file and RDN "cn=foo.ldif"'s
>      non-leaf directory.

Perhaps we should escape '.'
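
To illustrate how escaping '.' would remove the collision, here is a minimal sketch. The helper rdn2path() and the choice of "%" as escape character are assumptions, not existing back-ldif code. It escapes only '%' and '.', then appends a suffix: ".ldif" for an entry file, "" for a non-leaf directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: escape '%' and '.' in `rdn`, then append `suffix`.
 * Returns 0 on success, -1 if `out` (capacity `outlen`) is too small. */
int
rdn2path(const char *rdn, const char *suffix, char *out, size_t outlen)
{
    size_t o = 0, slen;

    assert(rdn != NULL && suffix != NULL && out != NULL);
    slen = strlen(suffix);

    for (; *rdn != '\0'; rdn++) {
        const char *esc = *rdn == '.' ? "%2E" : *rdn == '%' ? "%25" : NULL;
        size_t n = esc ? 3 : 1;

        if (o + n >= outlen)            /* keep room for the final NUL */
            return -1;
        if (esc)
            memcpy(out + o, esc, 3);
        else
            out[o] = *rdn;
        o += n;
    }
    if (o + slen >= outlen)
        return -1;
    memcpy(out + o, suffix, slen + 1);  /* copies the terminating NUL too */
    return 0;
}
```

Under these assumptions the entry file for RDN "cn=foo" stays "cn=foo.ldif", while the directory for RDN "cn=foo.ldif" becomes "cn=foo%2Eldif", so the two names no longer clash.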
>    - Sorted-value RDNs:
>      + In dn2path() when IX_FSL != IX_DNL (filename '{' != RDN '{'):
>        IX_FS[LR] must be hex-escaped.  They are treated as equivalent to
>        '{' - '}', thus different RDNs can map to the same filename.
>        The "{1}" in path "/foo/cn=x{,cn={1}y" is not recognized.
>        The IX_DN[LR] (i.e. '{', '}') to IX_FS[LR] translation is a bit
>        strange: It translates the first '{' it sees, then the next
>        '}', then the next '{', etc.  Any reason not to just translate
>        all '{' and '}' chars?  If so, we should at least reset at
>        each filename and '+'.
>      + Sorting (in ldif_r_enum_tree()):
>        Should "a={1}x" sort before or after "a=x"?  Currently it sorts
>        like "a={".  "a=" (normally before) or "a=<CHAR_MAX>" (normally
>        after) would be better.
>        RDN "attr=foo{bar}baz" is treated as a sorted value.  Can easily
>        check for '={' <successful strtol parse> '}' instead.
>        "attr={0<octal>}val" and "attr={0x<hex>}val" are recognized as
>        sorted values, should they be?
>      Some of this is from ITS#4627 (back-ldif issues).
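
For the '={' <strtol parse> '}' check, something along these lines might do. This is a sketch only; ordered_rdn_index() is a made-up name, and it looks only at the first attribute=value pair of the RDN. It parses strictly in base 10, which is one possible answer to the octal/hex question: "{0x10}" is rejected and "{010}" is read as decimal 10.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: does `rdn` look like "attr={<decimal>}value"?
 * Returns 1 and stores the index in *idx if so, otherwise returns 0. */
int
ordered_rdn_index(const char *rdn, long *idx)
{
    const char *p;
    char *end;
    long n;

    assert(rdn != NULL && idx != NULL);

    p = strchr(rdn, '=');
    if (p == NULL || p[1] != '{')
        return 0;
    p += 2;

    /* strtol() skips leading whitespace and accepts a sign, so insist
     * on a digit directly after '{'. */
    if (!isdigit((unsigned char)*p))
        return 0;

    errno = 0;
    n = strtol(p, &end, 10);    /* base 10: "0x10" stops at the 'x' */
    if (errno == ERANGE || *end != '}')
        return 0;

    *idx = n;
    return 1;
}
```

With this check "cn={1}y" is recognized with index 1, while "attr=foo{bar}baz" is treated as a plain value.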

   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/