
Re: slapcat generate extra "space" characters in LDIF output



Mark J. Reed writes:
> FWIW, I use this script to undo the line-folding of LDIF.  Just pipe
> the output of slapcat or ldapsearch or whatever through it and you'll
> get something that's no longer RFC-compliant LDIF but is more amenable
> to processing with text-based tools.
> (..snip..)

This one-liner is sufficient to undo the line folding:

      perl -p00e 's/\r?\n //g'
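
To make it explicit what that substitution does, here is roughly the same
thing as a small Python sketch (Python only for illustration; the name
`unfold` is made up here). It deletes every line break that is followed
by a single space, which is how LDIF marks a continuation line.

```python
import re


def unfold(ldif: str) -> str:
    """Join LDIF continuation lines.

    Removes each LF or CRLF that is immediately followed by one space,
    mirroring the substitution s/\\r?\\n //g.
    """
    return re.sub(r"\r?\n ", "", ldif)


if __name__ == "__main__":
    folded = (
        "dn: cn=Some Very Long Name,ou=Peo\n"
        " ple,dc=example,dc=com\n"
        "cn: Some Very Long Name\n"
    )
    print(unfold(folded), end="")
```

Only the first space after the line break is removed, so a value that
really contains a space at the fold point keeps it.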

For people _reading_ an LDIF, remember you may need further decoding:
The LDIF may contain "attrtype:: base64-encoded value" or
"attrtype:< URL of value".

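As a rough illustration of that extra decoding step, here is a Python
sketch (not Perl, and not a full LDIF parser). The function name
`decode_line` and its three-part return value are assumptions made for
this example. It expects a line that has already been unfolded.

```python
import base64


def decode_line(line: str):
    """Split one unfolded LDIF attribute line into (attrtype, kind, value).

    kind is "base64" (value returned as decoded bytes),
    "url" (value is the URL string, not fetched here),
    or "plain" (value is the text after the colon).
    """
    attr, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("not an attribute line: %r" % line)
    if rest.startswith(":"):
        # "attrtype:: base64-encoded value"
        return attr, "base64", base64.b64decode(rest[1:].strip())
    if rest.startswith("<"):
        # "attrtype:< URL of value"
        return attr, "url", rest[1:].strip()
    # "attrtype: value"; spaces after the colon are not part of the value
    return attr, "plain", rest.lstrip(" ")


if __name__ == "__main__":
    for line in ("cn: Hallvard",
                 "description:: SGVsbG8=",
                 "jpegPhoto:< file:///tmp/photo.jpg"):
        print(decode_line(line))
```

The base64 result is left as bytes because the value need not be text
at all (a photo or a certificate, for instance).
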
> Refolding is a one-liner, something like this:
> 
>     perl -pe 's/^(.{76})(..*)$/$1\n $2/'

The LDIF RFC doesn't require that, but it doesn't hurt either.

However, your snippet folds each line only once, so it fails when a
line needs to be folded into more than 2 lines.  Also, you should
avoid folding in the middle of a UTF-8 sequence.

#!/usr/bin/perl -wp
# Fold LDIF lines > 76 chars (Note, not an LDIF requirement)

# Match {71 to 76} chars 1st time, then {70 to 75} since we prepended a space
# Accept either CRLF or LF as line endings.  But always folds with just LF.
# Avoid splitting in the middle of a UTF-8 char, i.e. before [\200-\277].
s/\G ((?: \A[^\r\n] | (?!\A)) [^\r\n]{70,75}) (?=[^\r\n\200-\277]) /$1\n /gx;
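
For comparison, the same idea written out as a loop, as a Python sketch
(again Python only for illustration; `fold` and its `width` parameter are
names made up here). It is not a line-by-line port of the regex above:
it backs up from the fold point for as long as it sees UTF-8
continuation bytes, instead of limiting the search to a 70-75 window.
It works on bytes and takes one logical line without its line ending.

```python
def fold(line: bytes, width: int = 76) -> bytes:
    """Fold one LDIF line into physical lines of at most `width` bytes.

    Continuation lines start with a space, so they carry at most
    width - 1 bytes of the original line.  Never cuts directly before
    a UTF-8 continuation byte (0x80-0xBF).
    """
    pieces = []
    limit = width              # first physical line
    while len(line) > limit:
        cut = limit
        # line[cut] would start the next piece; back up if it is a
        # continuation byte, i.e. the cut would split a character.
        while cut > 1 and 0x80 <= line[cut] <= 0xBF:
            cut -= 1
        pieces.append(line[:cut])
        line = line[cut:]
        limit = width - 1      # room for the leading space
    pieces.append(line)
    return b"\n ".join(pieces)


if __name__ == "__main__":
    long_line = ("description: " + "\u00e9" * 100).encode("utf-8")
    print(fold(long_line).decode("utf-8"))
```

Like the Perl version, this always folds with a plain LF, and it keeps
folding until every physical line fits.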

-- 
Hallvard