
Re: OpenLDAP 2.0.6 -> 2.0.7 problem (ITS#890)




Daniel Chabrol wrote:

> According to the documentation, this syntax is a UTF-8 string, so it looks
> like I have to convert it. But there is too much data for a manual
> conversion. Is there a tool available to convert LDIF data from
> ISO-LATIN-1 to UTF-8?

Not that I know of. I have a Perl script that does something related: it
takes an LDIF file and outputs the suffix, translating both the
entry DN and DN-valued attributes.  A very crude adaptation
to your problem is:

#!/usr/bin/perl

use strict;
use diagnostics;

use Net::LDAP::LDIF;
use Unicode::String qw(latin1 utf8);

my $debug = 0;

my %ds_syntax;

{
    # List here attribute types with Directory String syntax
    my @ds_syntax = qw(cn sn o ou etc...);

    %ds_syntax = map { $_ => 1 } @ds_syntax;
}

my $o_ldif = Net::LDAP::LDIF->new( "tocho", "r" );
my $n_ldif = Net::LDAP::LDIF->new( "tocho.new", "w" );

while( my $entry = $o_ldif->read() ) {
    # Do things with $entry
    my $dn = $entry->dn;
    print "Procesando $dn\n" if $debug;
    $entry->dn(latin1($dn)->utf8);
    for my $at ($entry->attributes) {
        # print "Analizando $at\n" if $debug;
        if ($ds_syntax{lc($at)}) {
            # Fetch every value of this attribute as an array reference
            my $vals = $entry->get_value($at, asref => 1);
            for my $i (0..$#$vals) {
                print "Replacing $$vals[$i] by " if $debug;
                $$vals[$i] = latin1($$vals[$i])->utf8;
                print "$$vals[$i]\n" if $debug;
            }
            $entry->replace($at, $vals);
        }
    }

    $n_ldif->write($entry);
}
$o_ldif->done();
$n_ldif->done();
exit 0;

I have not even checked for syntax errors above, I just provide
it to show the basic technique, but be careful.  Sorry for the
English/Spanish mix, I happen to write like that a lot.

> And there is an additional problem: the attributes
> containing ISO-LATIN-1 characters are "binary" encoded in the LDIF data
> (for example cn:: S2xhdXMgVHL2bmRsZQ==). This data should be correctly
> converted to UTF-8.

No problem, Net::LDAP::LDIF should do it right.
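
To illustrate what has to happen to such a value, here is a minimal
sketch.  It assumes the core Encode and MIME::Base64 modules instead of
Unicode::String, and the helper name latin1_b64_to_utf8_b64 is made up
for this example; it is not part of Net::LDAP.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64 encode_base64);

# Hypothetical helper: take the base64 text that follows "attr:: ",
# treat the decoded bytes as ISO-8859-1, and return the base64 form
# of the same characters encoded as UTF-8.
sub latin1_b64_to_utf8_b64 {
    my ($b64) = @_;
    my $latin1_bytes = decode_base64($b64);
    my $chars        = decode('ISO-8859-1', $latin1_bytes);
    my $utf8_bytes   = encode('UTF-8', $chars);
    # The empty second argument suppresses the trailing newline
    return encode_base64($utf8_bytes, '');
}

# The cn value quoted above
print "cn:: ", latin1_b64_to_utf8_b64('S2xhdXMgVHL2bmRsZQ=='), "\n";
```

With the script above you should not need this, since the values arrive
already decoded; it only shows the byte-level conversion.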

> But the attribute userPassword looks like it is a
> normal octet string:
> 
> attributetype ( 2.5.4.35 NAME 'userPassword'
>         EQUALITY octetStringMatch
>         SYNTAX 1.3.6.1.4.1.1466.115.121.1.40{128} )
> 
> I suppose if this data is also converted to UTF-8, this will break
> user authentication. So the tool should leave the
> userPassword attribute untouched. Does anybody have a solution or tip?

Have a list of translatable attribute types as I did above.
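
In isolation, the whitelist idea could look like the sketch below.  It
assumes the core Encode module, a deliberately short example list of
attribute types, and a made-up helper name (convert_value); adjust the
list to your own schema.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed example whitelist of Directory String attribute types
my %translatable = map { lc($_) => 1 } qw(cn sn o ou);

# Hypothetical helper: re-encode a value from ISO-8859-1 to UTF-8 only
# when its attribute type is whitelisted; anything else, such as
# userPassword, is returned byte for byte.
sub convert_value {
    my ($attr, $value) = @_;
    return $value unless $translatable{ lc $attr };
    return encode('UTF-8', decode('ISO-8859-1', $value));
}

# cn is converted, userPassword is left alone
my $cn = convert_value('cn',           "Tr\xf6ndle");
my $pw = convert_value('userPassword', "s\xe4cret");
```

A whitelist is safer here than a blacklist: an attribute type you forgot
stays untouched instead of being converted by accident.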

Hope this helps,

Julio