[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: is_utf8 function for ldapsearch w/ UTF-8 strings



The preferred method for contributions is thru the ITS
<http://www.openldap.org/its/>; guildelines for potential contributors are
at <http://www.openldap.org/devel/contributing.html>.  Contributions sent
to a mailing list are not usually tracked.

p.

> Hey,
>
> I would like to contribute the below function in the public
> domain. This could be used by ldapsearch and friends as an alternative
> to ldif_is_not_printable to determine if an attribute value is a string
> or if it should be represented in base64 (currently non-ASCII strings
> are simply base64 encoded).
>
> Mike
>
> int
> is_utf8(const unsigned char *src, int n)
> {
>     const unsigned char *slim = src + n;
>
>     while (src < slim) {
>         int wc;
>
>         if (*src < 0x80) {
>             wc = *src;
>             src++;
>         } else if ((*src & 0xE0) == 0xC0) {
>             if ((slim - src) < 2) return 0;
>             wc = (*src++ & 0x1F) << 6;
>             if ((*src & 0xC0) != 0x80) {
>                 return 0;
>             } else {
>                 wc |= *src & 0x3F;
>             }
>             if (wc < 0x80) {
>                 return 0;
>             }
>             src++;
>         } else if ((*src & 0xF0) == 0xE0) {
>             /* less common */
>             if ((slim - src) < 3) return 0;
>             wc = (*src++ & 0x0F) << 12;
>             if ((*src & 0xC0) != 0x80) {
>                 return 0;
>             } else {
>                 wc |= (*src++ & 0x3F) << 6;
>                 if ((*src & 0xC0) != 0x80) {
>                     return 0;
>                 } else {
>                     wc |= *src & 0x3F;
>                 }
>             }
>             if (wc < 0x800) {
>                 return 0;
>             }
>             src++;
>         } else {
>             /* very unlikely */
>             return 0;
>         }
>     }
>
>     /* it's UTF-8 */
>     return 1;
> }
>




Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.n.c.
Via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
------------------------------------------
Office:   +39.02.23998309
Mobile:   +39.333.4963172
Email:    pierangelo.masarati@sys-net.it
------------------------------------------