Perl snippet for ISO-8859-1 to UTF-8 and Base64 encoding



For those who like Perl, here's a revised snippet of code (a subroutine)
which takes as parameters the LDAP attribute name (e.g., CN) and an
ISO-8859-1 (Latin-1) encoded string, and returns the pair in LDIF
syntax.  If the value contains characters with the 8th bit set, i.e.,
non-ASCII, it is converted to UTF-8 (Unicode) and Base64 encoded.

This snippet is clearly incomplete: it doesn't yet handle values with
trailing spaces (which also require Base64 encoding), nor does it
attempt to break long lines as per the LDIF spec, but it should work
for most cases.  Enjoy!

use MIME::Base64;
use Unicode::String qw(utf8 latin1 utf16);

sub mime_encode {

# There are two parameters.  The first is the attribute name,
# without the colon,  while the second is the value, e.g., 
# "cn" and "Paul Gillingwater." We return the concatenated string.

# This routine checks the value for non-ASCII characters, which we
# assume to have a value in the range 0x80..0xff, since we assume
# the input is single-byte encoded, not Unicode.

# If we match, then we must return the string as MIME encoded
# with Base64, otherwise we return it untouched.  Note that
# when we do MIME encoding, we need to add a second colon
# after the attribute name.  We also need to convert any
# ISO-8859-1 characters from Latin1 to UTF-8, for which we use
# the Unicode::String module.

# Note that encode_base64 normally breaks its output into lines of
# 76 characters, each ending in a newline; we pass an empty string
# as the second argument so the result stays on a single line

  my ($att,$val) = @_;
  if ($val =~ /[\x80-\xFF]/) {
    my $u = Unicode::String::latin1($val);  # interpret the value as ISO-8859-1
    $val = encode_base64($u->utf8, "");  # UTF-8, then Base64 without line breaks
    # we use the double colon to indicate Base64 (MIME) encoding
    return $att . ":: " . $val;
  } else {
    return $att . ": " . $val;
  }
}
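
The cases the subroutine above leaves out could be handled roughly as
follows.  This is only a sketch under some assumptions: the names
ldif_attr and fold_ldif are made up for illustration, the core Encode
module stands in for Unicode::String (a substitution, not what the
snippet above uses), and the fold width of 76 columns is a common
convention rather than something the LDIF spec (RFC 2849) mandates.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: fold a line so that no physical line exceeds
# $width characters.  Continuation lines start with a single space.
sub fold_ldif {
    my ($line, $width) = @_;
    $width ||= 76;   # assumed default, not required by RFC 2849
    return $line if length($line) <= $width;

    # 4-argument substr returns the chunk and removes it from $line.
    my $out = substr($line, 0, $width, '');
    while (length $line) {
        # The leading space counts towards the width.
        $out .= "\n " . substr($line, 0, $width - 1, '');
    }
    return $out;
}

# Hypothetical helper: build one (possibly folded) LDIF line from an
# attribute name and a Latin-1 encoded byte string.
sub ldif_attr {
    my ($att, $val) = @_;

    # Base64 is needed if the value contains NUL, LF, CR or a non-ASCII
    # byte, starts with space, colon or '<', or ends with a space.
    my $needs_b64 = $val =~ /[\x00\x0A\x0D\x80-\xFF]/
                 || $val =~ /\A[ :<]/
                 || $val =~ / \z/;

    my $line;
    if ($needs_b64) {
        # A Perl byte string is treated as Latin-1 code points, so
        # encode() produces the corresponding UTF-8 bytes.
        my $utf8 = encode('UTF-8', $val);
        # The empty second argument suppresses Base64 line breaks.
        $line = $att . ':: ' . encode_base64($utf8, '');
    } else {
        $line = $att . ': ' . $val;
    }
    return fold_ldif($line);
}

print ldif_attr('cn', 'Paul Gillingwater'), "\n";  # cn: Paul Gillingwater
print ldif_attr('cn', "J\xFCrgen"), "\n";          # cn:: SsO8cmdlbg==
```

Folding is applied after encoding, so a folded Base64 value unfolds
back to a single line once each newline-plus-space pair is removed.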

If your Perl installation doesn't include those modules, you can
install them (as root) from CPAN using:

perl -MCPAN -e shell
cpan> install MIME::Base64
cpan> install Unicode::String
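
Before installing anything, it may be worth checking whether the
modules are already present.  A minimal sketch, where have_module is a
hypothetical helper name chosen for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded,
# 0 otherwise.
sub have_module {
    my ($module) = @_;
    # Turn a module name into a file path, e.g.
    # Unicode::String -> Unicode/String.pm
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(MIME::Base64 Unicode::String)) {
    printf "%-16s %s\n", $module,
        have_module($module) ? 'installed' : 'missing';
}
```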

*********************************
        Paul Gillingwater
        Managing Director
 CSO Lanifex Unternehmensberatung 
 & Softwareentwicklung G.m.b.H.
      NEW BUSINESS CONCEPTS

E-mail:  paul@lanifex.com
Mobile:  +43/699/1922 3085
Webhome: http://www.lanifex.com
Address: Praterstrasse 60/1/2 
         A-1020 Vienna, Austria
*********************************