Full_Name: Daniel Chabrol
Version: 2.0.7
OS: Linux (2.2.16)
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (195.226.109.184)

Hello!

I upgraded from 2.0.6 to 2.0.7, but now I'm unable to import my old LDIF exports. I receive the following error in the syslog:

mail slapd[8264]: conn=1 op=10 RESULT tag=105 err=21 text=value contains invalid data

After some tests I noted that this is caused by "special" characters like the German umlauts (äöüÄÖÜ) in cn and sn attributes (objectclass person). If I use the same configuration as with 2.0.6, it still doesn't work.

What's the difference? And how can I fix this problem to be able to import my old directory data?

Daniel

chabrol@webonomics.de wrote:

> What's the difference?

Code is becoming more and more RFC-compliant, but in some cases that compliance affects uses that were common in the past, but always were dubious if not plain wrong. I think you used to have your special characters coded as ISO 8859-1. You cannot do that, they have to be in UTF-8 (an encoding of Unicode and ISO-10646).

> And how can I fix this problem to be able to import my old
> directory-data?

Sorry, I think you will have to translate them. And your old applications will have to be reconfigured for UTF-8. Sucks, I know, but OpenLDAP 2.x follows LDAPv3 and that mandates UTF-8.

Julio

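The encoding point in this reply can be illustrated with a small sketch. It is written in Python purely for illustration (the thread itself works in Perl), and it only mimics the kind of strict UTF-8 check a server could apply; it is not slapd's actual validation code. The helper name `is_valid_utf8` and the sample name are made up for the example.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample value: "Mueller" spelled with a u-umlaut (U+00FC).
latin1_bytes = "M\u00fcller".encode("latin-1")                # one byte, 0xFC, for the umlaut
utf8_bytes = latin1_bytes.decode("latin-1").encode("utf-8")   # two bytes, 0xC3 0xBC

print(is_valid_utf8(latin1_bytes))  # the lone 0xFC byte is not valid UTF-8
print(is_valid_utf8(utf8_bytes))    # the transcoded value is valid
print(is_valid_utf8(b"Mueller"))    # pure ASCII is valid in both encodings
```

Values that contain only US-ASCII characters are byte-for-byte identical in ISO 8859-1 and UTF-8, which is why only entries with umlauts or similar characters are affected.
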
Hello!

> I think you used to have your special characters coded as ISO 8859-1.
> You cannot do that, they have to be in UTF-8 (an encoding of Unicode
> and ISO-10646).

The umlauts are normal ISO 8859-1 (ISO Latin 1) characters.

> > And how can I fix this problem to be able to import my old
> > directory-data?
> Sorry, I think you will have to translate them. And your old
> applications will have to be reconfigured for UTF-8.

attributetype ( 2.5.4.3 NAME ( 'cn' 'commonName' ) SUP name )
attributetype ( 2.5.4.4 NAME ( 'sn' 'surname' ) SUP name )
attributetype ( 2.5.4.41 NAME 'name'
    EQUALITY caseIgnoreMatch
    SUBSTR caseIgnoreSubstringsMatch
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.15{32768} )

This Syntax is according to the documentation a UTF-8 string. So it looks like I've to convert it. But there is too much data for a manual conversion. Is there a tool available to convert LDIF-Data from ISO-LATIN-1 to UTF-8?

And there is an additional Problem: The attributes containing ISO-LATIN-1 characters are "binary" encoded in the LDIF-Data (for example cn:: S2xhdXMgVHL2bmRsZQ==). This data should be correctly converted to UTF-8. But the attribute userpassword looks like it is a normal octet string:

attributetype ( 2.5.4.35 NAME 'userPassword'
    EQUALITY octetStringMatch
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.40{128} )

I suppose if this data is also converted to UTF-8, this will break the user authentication. So the tool should leave the userpassword-attribute untouched. Has somebody a solution or tip?

Daniel

chabrol@webonomics.de wrote:

Hi,

> This Syntax is according to the documentation a UTF-8 string. So it looks
> like I've to convert it. But there is too much data for a manual
> conversion. Is there a tool available to convert LDIF-Data from
> ISO-LATIN-1 to UTF-8?

Try the 'iconv' utility: it supports conversion between many character sets, including of course ISO-LATIN-1 and UTF-8.

> And there is an additional Problem: The attributes
> containing ISO-LATIN-1 characters are "binary" encoded in the LDIF-Data
> (for example cn:: S2xhdXMgVHL2bmRsZQ==). This data should be correctly
> converted to UTF-8. But the attribute userpassword looks like it is a
> normal octet string:
>
> attributetype ( 2.5.4.35 NAME 'userPassword'
> EQUALITY octetStringMatch
> SYNTAX 1.3.6.1.4.1.1466.115.121.1.40{128} )
>
> I suppose if this data is also converted to UTF-8, this will break the
> user authentication. So the tool should leave the
> userpassword-attribute untouched. Has somebody a solution or tip?

It should work since such characters are common to ISO-LATIN-1 and UTF-8 (US-ASCII range).

Jean-Philippe.
--
| Jean-Philippe BRUNON - AURORA

Daniel Chabrol wrote:

> This Syntax is according to the documentation a UTF-8 string. So it looks
> like I've to convert it. But there is too much data for a manual
> conversion. Is there a tool available to convert LDIF-Data from
> ISO-LATIN-1 to UTF-8?

Not that I know. I have a Perl script that does something related, it takes an LDIF file and outputs the suffix, translating both the entry DN as well as DN-valued attributes. A very crude adaptation to your problem is:

    #!/usr/bin/perl
    use strict;
    use diagnostics;
    use Net::LDAP::LDIF;
    use Unicode::String qw(latin1 utf8);

    my $debug = 0;
    my %ds_syntax;
    {
        # List here attribute types with Directory String syntax
        my @ds_syntax = qw(cn sn o ou);    # etc...
        %ds_syntax = map { $_ => 1 } @ds_syntax;
    }

    my $o_ldif = Net::LDAP::LDIF->new( "tocho", "r" );
    my $n_ldif = Net::LDAP::LDIF->new( "tocho.new", "w" );

    while ( my $entry = $o_ldif->read() ) {
        # Do things with $entry
        my $dn = $entry->dn;
        print "Procesando $dn\n" if $debug;
        $entry->dn( latin1($dn)->utf8 );
        for my $at ( $entry->attributes ) {
            # print "Analizando $at\n" if $debug;
            if ( $ds_syntax{ lc($at) } ) {
                # Depending on the perl-ldap version, get() may have to be
                # replaced by get_value($at, asref => 1)
                my $vals = $entry->get($at);
                for my $i ( 0 .. $#$vals ) {
                    print "Replacing $$vals[$i] by " if $debug;
                    $$vals[$i] = latin1( $$vals[$i] )->utf8;
                    print "$$vals[$i]\n" if $debug;
                }
                $entry->replace( $at, $vals );
            }
        }
        $n_ldif->write($entry);
    }

    $o_ldif->done();
    $n_ldif->done();
    exit 0;

I have not even checked for syntax errors above, I just provide it to show the basic technique, but be careful. Sorry for the English/Spanish mix, I happen to write like that a lot.

> And there is an additional Problem: The attributes
> containing ISO-LATIN-1 characters are "binary" encoded in the LDIF-Data
> (for example cn:: S2xhdXMgVHL2bmRsZQ==). This data should be correctly
> converted to UTF-8.

No problem, Net::LDAP::LDIF should do it right.

> But the attribute userpassword looks like it is a
> normal octet string:
>
> attributetype ( 2.5.4.35 NAME 'userPassword'
> EQUALITY octetStringMatch
> SYNTAX 1.3.6.1.4.1.1466.115.121.1.40{128} )
>
> I suppose if this data is also converted to UTF-8, this will break the
> user authentication. So the tool should leave the
> userpassword-attribute untouched. Has somebody a solution or tip?

Have a list of translatable attribute types as I did above.

Hope this helps,

Julio

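The idea of an explicit list of translatable attribute types can also be sketched without perl-ldap. The following is a rough Python illustration, not a port of the Perl script and not an OpenLDAP tool. The whitelist `TEXT_ATTRS`, the helper names `unfold` and `convert_line`, and the sample entry are all assumptions made for the example. It assumes the LDIF file was read with the `latin-1` codec, and it ignores attribute options (such as `cn;lang-de`) and `attr:< url` values.

```python
import base64

# Hypothetical whitelist: attribute types with Directory String syntax.
# Anything not listed here (for example userPassword) is passed through as-is.
TEXT_ATTRS = {"dn", "cn", "sn", "o", "ou"}


def unfold(lines):
    """Join LDIF continuation lines (lines that start with a single space)."""
    out = []
    for line in lines:
        if line.startswith(" ") and out:
            out[-1] += line[1:]
        else:
            out.append(line)
    return out


def convert_line(line):
    """Re-encode one unfolded LDIF line from ISO 8859-1 to UTF-8 if whitelisted."""
    attr, sep, rest = line.partition(":")
    if not sep or attr.lower() not in TEXT_ATTRS:
        return line
    was_base64 = rest.startswith(":")
    if was_base64:
        raw = base64.b64decode(rest[1:].strip())   # "attr:: <base64>"
    else:
        raw = rest.strip().encode("latin-1")       # "attr: <plain value>"
    utf8 = raw.decode("latin-1").encode("utf-8")
    if was_base64 or not utf8.isascii():
        return f"{attr}:: {base64.b64encode(utf8).decode('ascii')}"
    return f"{attr}: {utf8.decode('ascii')}"


# Made-up sample entry; the sn value is base64 of Latin-1 bytes.
sample = [
    "dn: cn=test,o=example",
    "sn:: TfxsbGVy",
    "userPassword:: c2VjcmV0",
]
for ldif_line in unfold(sample):
    print(convert_line(ldif_line))
```

Values that arrive base64-encoded are written back base64-encoded, so a value that needed base64 for another reason (for example a leading space) is not accidentally turned into a plain value.
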
Hi!

> Try the 'iconv' utility: it supports conversion between many
> character sets, including of course ISO-LATIN-1 and UTF-8.

iconv would work if I could find a tool that converts the "binary" LDIF encoding (like cn:: S2xhdXMgVHL2bmRsZQ) into normal characters. Is there such a tool available? If so, I could convert this "special encoding" to normal ISO-LATIN-1 characters and then to UTF-8 with iconv.

Daniel

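The "binary" encoding asked about here is base64, which LDIF marks with a double colon after the attribute name. As a minimal sketch in Python (chosen only for illustration), the value can be decoded, transcoded, and re-encoded in a few lines. The example uses the fully padded form of the value, as quoted earlier in the thread; the variable names are made up for the example.

```python
import base64

ldif_line = "cn:: S2xhdXMgVHL2bmRsZQ=="

# Split "attr:: value" and decode the base64 payload into raw bytes.
attr, b64_value = ldif_line.split(":: ", 1)
raw = base64.b64decode(b64_value)

# The raw bytes are ISO 8859-1; decode them, then encode as UTF-8.
text = raw.decode("latin-1")
utf8_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")

# Non-ASCII UTF-8 values still have to be base64-encoded in LDIF.
print(f"{attr}:: {utf8_b64}")
```

The intermediate `raw` bytes are exactly what a byte-level converter such as iconv would expect as ISO 8859-1 input.
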
Hello!

> Not that I know. I have a Perl script that does something related, it
> takes an LDIF file and outputs the suffix, translating both the
> entry DN as well as DN-valued attributes.

With your Perl-script I was able to convert some attributes into UTF-8. The import worked perfectly. Thank you for your help!

Daniel

Daniel Chabrol wrote:
>
> With your Perl-script I was able to convert some attributes into UTF-8. The
> import worked perfectly. Thank you for your help!

Glad to know. It may not work for everyone, depending on the version of perl-ldap: the get/get_value methods in Net::LDAP::Entry have varying availability and semantics. Terrific modules anyway. All praise Graham Barr...

Julio

changed notes changed state Open to Closed