[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#5408) back-ldif bugs



Full_Name: Hallvard B Furuseth
Version: HEAD, RE23, RE24
OS: Linux
URL: http://folk.uio.no/hbf/OpenLDAP/back-ldif.c
Submission from: (NULL) (129.240.6.233)
Submitted by: hallvard


Here are a bunch of back-ldif bugs some questions.  I have code for most
of it, but need advice/discussion on functionality changes.

I've been sitting on it too long and would keep sitting if I waited
until I'd cleaned up everything, so, posting now instead.  I enclose an
URL for a *draft* back-ldif/ldif.c.


Functionality changes.

* Some LDIF files/directories with special chars need to be renamed:

  RE24 or RE25 change?  If anyone uses back-ldif as a regular database,
  they may need to slapcat/slapadd to upgrade past these changes.

  - back-ldif escapes the directory separator as \<hex value>.  But
    Windows uses "\" as directory separator, so that doesn't help.  It
    needs another escape character.

    To avoid double hex-escaping of DNs that contain "\", we can pick
    another escape char, e.g. "^", escape that too, and translate "\" to
    it.  Thus DN "cn=x^y\2Cz" gets filename "cn=x^5Ey^2Cz.ldif".

  - More characters should likely be hex-escaped:
    + ":" (as in "C:\") and "/" on Windows?  I've seen programs use the
      latter as directory separator even on Windows.  Others?
    + 8-bit chars in case the OS gets clever about charset handling.
    + Control chars.

    I don't use Windows myself though.  Nor Mac...

  - When back-ldif uses OS-specific escaping, we can't move a directory
    tree between Windows and Unix hosts if some RDN in the tree contains
    characters with OS-specific escaping.

    Should we special-case both Unix' and Windows' directory separators
    on both OSes?  Then instead there will be more directory trees which
    can't be move between OpenLDAP versions, before and after the
    change.  Assuming anyone uses back-ldif in the first place:-)

  - The database suffix must be hex-escaped like the rest of the
    filenames.

  - RDNs ending with ".ldif" should be escaped.  Currently "cn=foo.ldif"
    is the name of both RDN "cn=foo"'s entry file and RDN "cn=foo.ldif"'s
    non-leaf directory.

  - Sorted-value RDNs:

    + In dn2path() when IX_FSL != IX_DNL (filename '{' != RDN '{'):
      IX_FS[LR] must be hex-escaped.  They are treated as equivalent to
      '{' - '}', thus different RDNs can map to the same filename.

      The "{1}" in path "/foo/cn=x{,cn={1}y" is not recognized.
      The IX_DN[LR] (i.e. '{', '}') to IX_FS[LR] translation is a bit
      strange: It translates the first '{' it sees, then the next
      '}', then the next '{', etc.  Any reason not to just translate
      all '{' and '}' chars?  If so, we should at least reset at
      each filename and '+'.

    + Sorting (in ldif_r_enum_tree()):

      Should "a={1}x" sort before or after "a=x"?  Currently it sorts
      like "a={".  "a=" (normally before) or "a=<CHAR_MAX>" (normally
      after) would be better.

      RDN "attr=foo{bar}baz" is treated as a sorted value.  Can easily
      check for '={' <successful strtol parse'> '}' instead.
      "attr={0<octal>}val" and "attr={0x<hex>}val" are recognized as
      sorted values, should they be?

    Some of this is from ITS#4627 (back-ldif issues).

* If slapcat encounters an error in some entry, it outputs a partial
  tree and returns success - no indication that the tree is wrong.

  It looks deliberate: ldif_tool_entry_first() explicitly ignores the
  error code.  Looks to me like it should at least cause be_entry_get()
  to fail.

  Maybe it should also output an empty tree at failure.  (entry_first()
  reads all the entries before it returns anything.)  Or if the intent
  is to output as much as possible in case of failure (at least with
  slapcat -c), it should not give up at the first error it encounters.

* Referral handling is broken:

  - Apparently manageDSAit should only apply to the operation's base
    object: Return a referral if the base is missing and a parent is
    a referral object.  That matches back-bdb, but breaks RFC 3296.

  - ldif_back_referrals() should return non-success for more internal
    errors, like BDB.

  - Search referrals (in ldif_back_referrals()) and continuation
    references (in ldif_back_search() get the wrong scope.
    The former should use op->ors_scope, the latter should use
    the same scope as when it looked up the entry.  (It has already
    changed LDAP_SCOPE_ONELEVEL to LDAP_SCOPE_BASE.)

  - Modify DN should catch newSuperior=<referral object>.

  - The default_referral code can be deleted, see ITS#5339.

* Multiple suffixes break back-ldif.

  Simplest fix is to forbid multiple suffixes: Set be_flag
  SLAP_DBFLAG_ONE_SUFFIX in bi_db_init(), if I understand correctly.

* olcDbDirectory should be 'SINGLE-VALUE'd in the back-ldif schema.

* Internal errors should return LDAP result code 'other', not
  unwillingToPerform/busy.

  Unless there is some reason it returned those, like back-config
  expects unwillingToPerform?  I didn't see any.

* Check for Abandon.

  At least in Search, which rfc4511 requires to check for Abandon.  As
  far as I can tell we just need to sprinkle be->be_<operation>() with
    "if (op->o_abandon) return SLAPD_ABANDON;"
  and the caller will take care of the rest.

* ITS#5329: ACL deadlock.  Should the ACL check be there?  More checks?

========================================================================

Other bugs for which the fixes should not break anything:

* Modify DN with newSuperior does not create the new parent directory.

* Search should send noSuchObject only when it is the baseobject which
  does not exist.  (ldif_r_enum_tree() checks the scope to see if it
  .is at the baseobject, that doesn't work.)

* ldapadd <suffix entry> creates <database directory>
  if file <database directory>.ldif exists.

* tmpfile-names can in theory match a DN/RDN (and thus a subtree's
  directory name).  A simple fix is to include ",," in tempfile-names.

* slurp_file() buffer overrun if file size > (int)fstat().st_size.
  (Size > INT_MAX, file grew after fstat(), filetype where st_size=0.)

Since I was changing so much anyway, I did some other changes:

* Better result code/error handling, improve/normalize Debug output.
  - Handle errors from most system calls.  modrdn in particular was bad.
  - Drop the one use of op->o_log_prefix in Debug(), it seems out of
    place.
  - Return LDAP result codes from more functions.
  - spew_entry():
    + Remove *save_errnop arg, return/print error instead.
    + Wrong save_errno setting after close().
    + Wrong spew_entry() error-return check in ldif_back_modify().
  - move_entry() checked incorrectly for open()/spew_entry() errors.
  - Remove the ENOENT special case in spew_entry().  Should only happen
    if someone is messing with the database contents by hand, or due to
    bugs I've fixed (like in move_entry).

* Move code around, in particular:
  - Move file operations around to help error checking.
  - Do all reads before writes in LDAP update ops, to narrow down locks.
  - Move common add/modify/modrdn code to ldif_prepare_create().
  - Make directories and files in a single function ldif_spew_entry().
  - Reuse computed values more.

* Change some integer types to what they are used for.

* Cleanup:
  - Rename some functions by prepending "ldif_" for better debug output,
    and get_entry/get_entry_for_fd() -> ldif_read_entry/read_entry_file
    so those names differ more from ldif_back_entry_get().
  - Add comments.
  - Reformat somewhat.  Function headers, too long lines, spacing, etc.
  - Remove 'if something != NULL' tests where that's already guaranteed.
  - Reduce switch statements - remove no-op default/break, etc.
  - Rename some variables. dn=>ndn for normalized DN, rs=>rc for result
    code, etc.  Use macros like ors_scope for oq_search.rs_scope.