OpenLDAP
Viewing Software Enhancements/4604

From: reiffert@student.physik.uni-mainz.de
Subject: backs-sql: backsql_process_filter(): invalid filter for UTF-8 querys
Date: Sat, 1 Jul 2006 08:07:01 GMT
From: reiffert@student.physik.uni-mainz.de
To: openldap-its@OpenLDAP.org
Subject: backs-sql: backsql_process_filter(): invalid filter for UTF-8 querys
Full_Name: Thomas Reifferscheid
Version: 2.3.24
OS: Linux/2.6.16
URL: http://student.physik.uni-mainz.de/~reiffert/back_sql.txt
Submission from: (NULL) (84.168.189.212)


I'm using Thunderbird as a slapd frontend.
When searching for a recipient whose name contains special characters
like "..." (German umlauts), slapd's syslog correctly shows the
UTF-8 encoded characters.

slapd then hands over the filter to back-sql which then returns
"invalid filter".


When searching for "M*ller", i.e. replacing "." by "*", the answers
that come back to Thunderbird correctly show "M.ller".


A loglevel 65535 output from a "m.l" search can be found
at the provided URL.

Kind regards
Thomas Reifferscheid


Followup 1

Date: Sat, 1 Jul 2006 10:17:39 +0200 (CEST)
From: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
To: openldap-its@OpenLDAP.org
Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
 for UTF-8 querys 

Additional Information:

The ITS web interface replaces the German special characters mentioned
above with "."

You can find examples here:
http://de.wikipedia.org/wiki/Umlaut#Umlautbuchstaben


(öäüÖÄÜß) -> (.......)


So when reading the original issue, please keep in mind
that you have to replace "." by one of those characters.



Hope that helps
Thomas Reifferscheid
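
For reference, here is a small Python sketch of how the characters listed above are encoded in the two character sets that appear in this thread. The helper name `ldap_log_escape` is made up for illustration; it only mimics the backslash-hex form in which non-ASCII bytes show up in the slapd log lines quoted in later messages, and it is not slapd's own code.

```python
UMLAUTS = "öäüÖÄÜß"


def ldap_log_escape(text: str) -> str:
    """Render non-ASCII UTF-8 bytes as \\XX (hypothetical helper)."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            out.append(chr(byte))        # plain ASCII stays readable
        else:
            out.append("\\%02X" % byte)  # every non-ASCII byte becomes \XX
    return "".join(out)


for ch in UMLAUTS:
    latin = ch.encode("iso-8859-15").hex().upper()  # one byte per character
    utf8 = ch.encode("utf-8").hex().upper()         # two bytes per character
    print(ch, "ISO-8859-15:", latin, "UTF-8:", utf8)

print(ldap_log_escape("mül"))  # m\C3\BCl
```

Each of these characters is one byte in ISO-8859-15 and two bytes in UTF-8, which would explain why the web archive shows "." in some copies of the text and ".." in others.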



Followup 2

Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
	for UTF-8 querys
From: Pierangelo Masarati <ando@sys-net.it>
To: reiffert@student.physik.uni-mainz.de
Cc: openldap-its@OpenLDAP.org
Date: Sat, 01 Jul 2006 16:02:13 +0200
On Sat, 2006-07-01 at 08:07 +0000, reiffert@student.physik.uni-mainz.de
wrote:
> Full_Name: Thomas Reifferscheid
> Version: 2.3.24
> OS: Linux/2.6.16
> URL: http://student.physik.uni-mainz.de/~reiffert/back_sql.txt
> Submission from: (NULL) (84.168.189.212)
> 
> 
> I'm using Thunderbird as slapd frontend.
> When searching for an recipient whose name contains special chars
> like "......" (german umlauts), the slapd's syslog correctly shows the
> utf-8 encoded characters.
> 
> slapd then hands over the filter to back-sql which then returns
> "invalid filter".
> 
> 
> When searching for "M*ller", so replacing ".." by "*", the answers
> that get back to thunderbird, correctly show "M..ller".
> 
> 
> A loglevel 65535 output from a "m..l" search can be found
> at the provided URL.

back-sql acts as a "dumb" backend to a generic RDBMS; LDAP requires UTF-8
for directory strings, while the RDBMS uses whatever character set it is
configured to use.  You need to make sure appropriate conversion occurs
in between.  back-sql does not provide any means to convert strings
to/from another character set.  A patch that consistently enables
transliteration from UTF-8 to whatever charset is used in the RDBMS for a
given data type is welcome, provided it complies with the license and
other contribution criteria of OpenLDAP software.

p.




Ing. Pierangelo Masarati
Responsabile Open Solution
OpenLDAP Core Team

SysNet s.n.c.
Via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
------------------------------------------
Office:   +39.02.23998309          
Mobile:   +39.333.4963172
Email:    pierangelo.masarati@sys-net.it
------------------------------------------
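
To illustrate the kind of conversion described in the message above, here is a minimal Python sketch. It assumes the database stores strings in ISO-8859-15; the function names and the default charset are hypothetical. It is not part of back-sql, which (as stated above) offers no such conversion.

```python
def to_db_charset(value: bytes, db_charset: str = "iso-8859-15") -> bytes:
    """Convert a UTF-8 LDAP string to the (assumed) database charset."""
    text = value.decode("utf-8")    # raises UnicodeDecodeError on invalid UTF-8
    return text.encode(db_charset)  # raises UnicodeEncodeError if unmappable


def from_db_charset(value: bytes, db_charset: str = "iso-8859-15") -> bytes:
    """Convert a database string back to UTF-8 for the LDAP side."""
    return value.decode(db_charset).encode("utf-8")


ldap_value = b"m\xc3\xbcller"                   # "müller" as sent over LDAP
db_value = to_db_charset(ldap_value)
print(db_value)                                 # b'm\xfcller'
print(from_db_charset(db_value) == ldap_value)  # True
```

A real implementation would also have to decide what to do with characters that have no representation in the database charset; this sketch simply lets the exception propagate.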



Followup 3

Date: Sat, 1 Jul 2006 16:47:07 +0200 (CEST)
From: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
To: Pierangelo Masarati <ando@sys-net.it>
cc: openldap-its@OpenLDAP.org
Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
 for UTF-8 querys
Dear Pierangelo,

regarding http://student.physik.uni-mainz.de/~reiffert/back_sql.txt
...


On Sat, 1 Jul 2006, Pierangelo Masarati wrote:

> back-sql acts as a "dumb" backend to a generic RDBMS; LDAP requires UTF8
> for directory strings,


Jul  1 09:51:10 hell slapd[25812]:     filter: (|(cn=m\C3\BCl*)(?=undefined)(sn=m\C3\BCl*))


Isn't that exactly the form it should have?

If not, what should it look like at this point so that

the answer is not:

Jul  1 09:51:10 hell slapd[25812]: backsql_process_filter(): invalid filter


> while the RDBMS uses whatever instructed to.  You
> need to make sure appropriate conversion occurs in between.  Back-sql
> does not provide any means to convert strings to/from another character
> set.  A patch that consistently enables translitteration from UTF8 to
> whatever charset is used in the RDBMS for a given data type is welcome,
> provided it complies with the license and other contribution criteria of
> OpenLDAP software.

If I understood you better, I would probably start working on a workaround
or an OpenLDAP enhancement to get around that nasty thing!

Kind regards
Thomas Reifferscheid



Followup 4

Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
	for UTF-8 querys
From: Pierangelo Masarati <ando@sys-net.it>
To: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
Cc: openldap-its@OpenLDAP.org
Date: Sat, 01 Jul 2006 17:06:32 +0200
On Sat, 2006-07-01 at 16:47 +0200, Thomas Reifferscheid wrote:
> Dear Pieranelo,
> 
> regarding http://student.physik.uni-mainz.de/~reiffert/back_sql.txt
> ...
> 
> 
> On Sat, 1 Jul 2006, Pierangelo Masarati wrote:
> 
> > back-sql acts as a "dumb" backend to a generic RDBMS; LDAP requires UTF8
> > for directory strings,
> 
> 
> Jul  1 09:51:10 hell slapd[25812]:     filter: 
> (|(cn=m\C3\BCl*)(?=undefined)(sn=m\C3\BCl*))

This seems to indicate that the second AVA in your filter is incorrect.
I have no idea what Thunderbird sends as the search filter, but most
likely either that attribute is not defined in your slapd's schema, or
its value does not comply with that attribute's syntax.

I suggest you find out (e.g. by looking at logs with level "packets")
what filter is actually sent; then check if the attribute is defined in
your schema and if the value complies with the syntax of that attribute.

> 
> 
> isn't that in the exact right way it has to be?
> 
> If not, how has it to look like at this place so that
> 
> the answer is not:
> 
> Jul  1 09:51:10 hell slapd[25812]: backsql_process_filter(): invalid filter
> 
> 
> > while the RDBMS uses whatever instructed to.  You
> > need to make sure appropriate conversion occurs in between.  Back-sql
> > does not provide any means to convert strings to/from another character
> > set.  A patch that consistently enables translitteration from UTF8 to
> > whatever charset is used in the RDBMS for a given data type is welcome,
> > provided it complies with the license and other contribution criteria of
> > OpenLDAP software.
> 
> If I'd understand you better, I probably will start finding a workaround
> to enhance openldap getting around that nasty thing!

This remains valid: LDAP strings are UTF-8 encoded; data in your RDBMS is
encoded in whatever character set it was configured with.  back-sql should
be smart enough to allow mapping, but right now it's not.

p.







Followup 5

Date: Sat, 1 Jul 2006 17:41:14 +0200 (CEST)
From: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
To: Pierangelo Masarati <ando@sys-net.it>
cc: openldap-its@OpenLDAP.org
Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
 for UTF-8 querys



On Sat, 1 Jul 2006, Pierangelo Masarati wrote:

>> Jul  1 09:51:10 hell slapd[25812]:     filter:
>> (|(cn=m\C3\BCl*)(?=undefined)(sn=m\C3\BCl*))
>
> This seems to indicate that the second AVA in your filter is incorrect.
> I have no idea as per what Thunderbird sends as search filter, but
> likely either that attribute is not defined in your slapd's schema, or
> its value does not comply with that attribute's syntax.


Oh, how could I have been so blind. Thanks for the hint. Here is the
result:

working filter, for entering "kepper" into thunderbird:

==>backsql_search():
base="dc=foo,dc=de", filter="(|(cn=kepper*)(mail=kepper*)(sn=kepper*))"


And here is what happens when I enter "Müller", the word
with the German special char:

==>backsql_search():
base="dc=foo,dc=de", filter="(|(cn=m\C3\BCller*)(?=undefined)(sn=m\C3\BCller*))"


Now let's see why.


starting slapd with -d 255 showed up:

ldap_read: want=101, got=101
  0000: 11 64 63 3d 64 69 65 66 69 72 6d 61 2c 64 63 3d .dc=diefirma,dc=
  0010: 64 65 0a 01 02 0a 01 00 02 02 00 c8 02 01 00 01 de..............
  0020: 01 00 a1 35 a4 0f 04 02 63 6e 30 09 80 07 6d c3 ...5....cn0...m.
  0030: bc 6c 6c 65 72 a4 11 04 04 6d 61 69 6c 30 09 80 .ller....mail0..
  0040: 07 6d c3 bc 6c 6c 65 72 a4 0f 04 02 73 6e 30 09 .m..ller....sn0.
  0050: 80 07 6d c3 bc 6c 6c 65 72 30 0a 04 02 63 6e 04 ..m..ller0...cn.
  0060: 04 6d 61 69 6c                                  .mail
ber_get_next: tag 0x30 len 107 contents:
ber_dump: buf=0x08256e08 ptr=0x08256e08 end=0x08256e73 len=107
  0000: 02 01 02 63 66 04 11 64 63 3d 64 69 65 66 69 72 ...cf..dc=diefir
  0010: 6d 61 2c 64 63 3d 64 65 0a 01 02 0a 01 00 02 02 ma,dc=de........
  0020: 00 c8 02 01 00 01 01 00 a1 35 a4 0f 04 02 63 6e .........5....cn
  0030: 30 09 80 07 6d c3 bc 6c 6c 65 72 a4 11 04 04 6d 0...m..ller....m
  0040: 61 69 6c 30 09 80 07 6d c3 bc 6c 6c 65 72 a4 0f ail0...m..ller..
  0050: 04 02 73 6e 30 09 80 07 6d c3 bc 6c 6c 65 72 30 ..sn0...m..ller0
  0060: 0a 04 02 63 6e 04 04 6d 61 69 6c                ...cn..mail
ber_get_next

So it still looks like everything is fine.
What came to my mind here was: is the attribute mail allowed by the
schema to contain UTF-8 chars? I could not answer that
question yet.
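
As a cross-check of the dump above, the bytes of the `mail` filter item can be decoded by hand. The following Python sketch is a deliberately minimal BER reader (single-byte tags and short-form lengths only); the function names are made up for illustration and have nothing to do with liblber's API.

```python
def read_tlv(buf: bytes, pos: int):
    """Read one BER element with a single-byte tag and short-form length."""
    tag, length = buf[pos], buf[pos + 1]
    if length & 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    start = pos + 2
    return tag, buf[start:start + length], start + length


def decode_substring_filter(buf: bytes):
    """Decode one LDAP substrings filter item into (attribute, parts)."""
    tag, body, _ = read_tlv(buf, 0)
    if tag != 0xA4:                    # [4] substrings, constructed
        raise ValueError("not a substrings filter")
    _, attr, pos = read_tlv(body, 0)   # OCTET STRING: attribute description
    _, parts, _ = read_tlv(body, pos)  # SEQUENCE OF substring choices
    kinds = {0x80: "initial", 0x81: "any", 0x82: "final"}
    out, pos = [], 0
    while pos < len(parts):
        t, val, pos = read_tlv(parts, pos)
        out.append((kinds[t], val))
    return attr.decode("ascii"), out


# Bytes copied from the ldap_read dump above (offsets 0x35 to 0x47).
mail_item = bytes.fromhex("a4 11 04 04 6d 61 69 6c 30 09 80 07 6d c3 bc 6c 6c 65 72")
attr, parts = decode_substring_filter(mail_item)
print(attr, parts)  # mail [('initial', b'm\xc3\xbcller')]
```

So the client really does send an initial-substring match on `mail` whose value contains the two UTF-8 bytes C3 BC; the request on the wire is well-formed, and the item only turns into `(?=undefined)` inside slapd.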

Let's see what happens next:

begin get_filter_list
begin get_filter
SUBSTRINGS
begin get_ssa
ber_scanf fmt ({m) ber:
ber_dump: buf=0x08256e08 ptr=0x08256e32 end=0x08256e73 len=65
0000:a4 0f 04 02 63 6e 30 0980 07 6d c3 bc 6c 6c 65 ....cn0...m..lle
0010:72 a4 11 04 04 6d 61 696c 30 09 80 07 6d c3 bc r....mail0...m..
0020:6c 6c 65 72 a4 0f 04 0273 6e 30 09 80 07 6d c3 ller....sn0...m.
0030:bc 6c 6c 65 72 30 0a 0402 63 6e 04 04 6d 61 69 .ller0...cn..mai
0040:6c                                             l
ber_scanf fmt (m) ber:
ber_dump: buf=0x08256e08 ptr=0x08256e3a end=0x08256e73 len=57
0000:80 07 6d c3 bc 6c 6c 6572 a4 11 04 04 6d 61 69 ..m..ller....mai
0010:6c 30 09 80 07 6d c3 bc6c 6c 65 72 a4 0f 04 02 l0...m..ller....
0020:73 6e 30 09 80 07 6d c3bc 6c 6c 65 72 30 0a 04 sn0...m..ller0..
0030:02 63 6e 04 04 6d 61 696c                      .cn..mail
INITIAL
end get_ssa

Now it comes to scanf:


begin get_ssa
ber_scanf fmt ({m) ber:
ber_dump: buf=0x08256e08 ptr=0x08256e43 end=0x08256e73 len=48
0000:00 11 04 04 6d 61 69 6c30 09 80 07 6d c3 bc 6c ....mail0...m..l
0010:6c 65 72 a4 0f 04 02 736e 30 09 80 07 6d c3 bc ler....sn0...m..
0020:6c 6c 65 72 30 0a 04 0263 6e 04 04 6d 61 69 6c ller0...cn..mail
ber_scanf fmt (m) ber:
ber_dump: buf=0x08256e08 ptr=0x08256e4d end=0x08256e73 len=38
0000:80 07 6d c3 bc 6c 6c 6572 a4 0f 04 02 73 6e 30 ..m..ller....sn0
0010:09 80 07 6d c3 bc 6c 6c65 72 30 0a 04 02 63 6e ...m..ller0...cn
0020:04 04 6d 61 69 6c                              ..mail
error=21
end get_filter 0


error=21 shows up.
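
LDAP result code 21 is invalidAttributeSyntax (RFC 4511), which is presumably what this debug line reports. In the stock schema, `mail` uses the IA5 String syntax, which only admits 7-bit characters, so a value containing the bytes C3 BC cannot be valid for it. A rough Python sketch of such a check follows; the helper names and the two syntax labels are hypothetical and this is not slapd's validation code.

```python
LDAP_INVALID_SYNTAX = 21  # invalidAttributeSyntax in RFC 4511


def is_ia5(value: bytes) -> bool:
    """True if every byte is 7-bit, as the IA5 String syntax requires."""
    return all(b < 0x80 for b in value)


def check_assertion(syntax: str, value: bytes) -> int:
    """Return 0 on success, or LDAP_INVALID_SYNTAX (hypothetical helper)."""
    if syntax == "ia5" and not is_ia5(value):
        return LDAP_INVALID_SYNTAX
    if syntax == "directory":  # Directory String: must be valid UTF-8
        try:
            value.decode("utf-8")
        except UnicodeDecodeError:
            return LDAP_INVALID_SYNTAX
    return 0


print(check_assertion("ia5", b"kepper"))               # 0
print(check_assertion("ia5", b"m\xc3\xbcller"))        # 21
print(check_assertion("directory", b"m\xc3\xbcller"))  # 0
```

This matches what the logs show: the `cn` and `sn` items, which take Directory String values, pass, while the `mail` item is rejected.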

Finally, let's see the rest:

begin get_filter
SUBSTRINGS
begin get_ssa
ber_scanf fmt ({m) ber:
ber_dump: buf=0x08256e08 ptr=0x08256e56 end=0x08256e73 len=29
0000:00 0f 04 02 73 6e 30 0980 07 6d c3 bc 6c 6c 65 ....sn0...m..lle
0010:72 30 0a 04 02 63 6e 0404 6d 61 69 6c          r0...cn..mail
ber_scanf fmt (m) ber:
ber_dump: buf=0x08256e08 ptr=0x08256e5e end=0x08256e73 len=21
0000:80 07 6d c3 bc 6c 6c 6572 30 0a 04 02 63 6e 04 ..m..ller0...cn.
0010:04 6d 61 69 6c                                 .mail
INITIAL
end get_ssa
end get_filter 0
end get_filter_list
end get_filter 0
filter: (|(cn=m\C3\BCller*)(?=undefined)(sn=m\C3\BCller*))




I hope the output will he

Message of length 6043 truncated


Followup 6

Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
	for UTF-8 querys
From: Pierangelo Masarati <ando@sys-net.it>
To: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
Cc: openldap-its@OpenLDAP.org
Date: Sat, 01 Jul 2006 18:27:28 +0200
On Sat, 2006-07-01 at 17:41 +0200, Thomas Reifferscheid wrote:

> I changed in core.schema
> 
> attributetype ( 0.9.2342.19200300.100.1.3
>          NAME ( 'mail' 'rfc822Mailbox' )
>          DESC 'RFC1274: RFC822 Mailbox'
>      EQUALITY caseIgnoreIA5Match
>      SUBSTR caseIgnoreIA5SubstringsMatch
>      SYNTAX 1.3.6.1.4.1.1466.115.121.1.26{256} )
> 
> 
> to
> 
> 
> attributetype ( 0.9.2342.19200300.100.1.3
>          NAME ( 'mail' 'rfc822Mailbox' )
>          DESC 'RFC1274: RFC822 Mailbox'
>      EQUALITY caseIgnoreMatch
>      SUBSTR caseIgnoreSubstringsMatch
>      SYNTAX 1.3.6.1.4.1.1466.115.121.1.15{32768} )
> 
> 
> and now it works!
> 
> 
> Thanks a lot for the right hint on the subject.
> The issue can be closed.

You shouldn't do that: changing standards-track schema is not good
practice.  As far as I remember, RFC 2822 does not allow arbitrary chars
in the local part of an address; they might be allowed in the display
name, but I fear it's restricted to plain ASCII as well.  I suspect the
bug here, if any, is that back-sql should simply ignore that part of the
filter while converting it into an SQL query.  This is what back-sql is
supposed to do; I'll check if it works as expected.

p.







Followup 7

Date: Sat, 1 Jul 2006 18:25:19 +0200 (CEST)
From: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
To: Pierangelo Masarati <ando@sys-net.it>
cc: openldap-its@OpenLDAP.org
Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
 for UTF-8 querys



On Sat, 1 Jul 2006, Pierangelo Masarati wrote:

> You shouldn't do that: changing standard track schema is not good
> practice.  As far as I remember, RFC 2822 does not allow arbitrary chars
> in the local part of an address; they might be allowed in the display
> name, but I fear it's restricted to plain ASCII as well.  I suspect the
> bug here, if any, is that back-sql should simply ignore that part of the
> filter while converting it into an SQL query.  This is what back-sql is
> supposed to do; I'll check if it works as expected.

Additional information:

When the (mail=müller) arrives and gets
translated to (?=undefined), no MySQL query is performed against
the MySQL database.

Hope that helps checking.

Kind regards
Thomas Reifferscheid



Followup 8

Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
	for UTF-8 querys
From: Pierangelo Masarati <ando@sys-net.it>
To: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
Cc: openldap-its@OpenLDAP.org
Date: Sat, 01 Jul 2006 19:16:53 +0200
On Sat, 2006-07-01 at 18:25 +0200, Thomas Reifferscheid wrote:

> when the (mail=m..ller) arrives and get's
> translated to (?=undefined), there is no mysql-query performed against
> the mysql-db.

Thanks; I've spotted this separate but related issue in filter
translation; it should now be fixed in HEAD; you may pull it out of the
CVS by getting a diff of servers/slapd/back-sql/search.c  1.118 ->
1.119; it should apply smoothly to 2.3.24.  Please test at your
convenience and report thru the ITS.

p.







Followup 9

Date: Sun, 2 Jul 2006 17:51:32 +0200 (CEST)
From: Thomas Reifferscheid <reiffert@Student.Physik.Uni-Mainz.DE>
To: Pierangelo Masarati <ando@sys-net.it>
cc: openldap-its@OpenLDAP.org
Subject: Re: (ITS#4604) backs-sql: backsql_process_filter(): invalid filter
 for UTF-8 querys



On Sat, 1 Jul 2006, Pierangelo Masarati wrote:

> it should now be fixed in HEAD; you may pull it out of the
> CVS by getting a diff of servers/slapd/back-sql/search.c  1.118 ->
> 1.119; it should apply smoothly to 2.3.24.  Please test at your
> convenience and report thru the ITS.

Using search.c 1.119 from HEAD, it works now:

conn=1 op=1 SRCH base="dc=foo,dc=de" scope=2 deref=0 filter="(|(cn=m\C3\BCller*)(?=undefined)(sn=m\C3\BCller*))"

An additional mysql query is sent afterwards:

...
backsql_process_filter(): filter computed (UNDEFINED)
...

<==backsql_srch_query() returns SELECT DISTINCT
ldap_entries.id,ldap_entries.id,'inetOrgPerson' AS
objectClass,ldap_entries.dn AS dn FROM ldap_entries
WHERE ldap_entries.id=ldap_entries.keyval AND ldap_entries.oc_map_id=? AND 9=9
AND 1=1 AND ((ldap_entries.cn LIKE 'müller%') OR 12=0 OR (ldap_entries.sn
LIKE 'müller%'))


Many thanks for fixing at the speed of light.

Kind regards
Thomas Reifferscheid
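
In the query logged in this message, the undefined `mail` item appears as the constant-false term `12=0` inside the OR, so the remaining `cn` and `sn` conditions still select entries. The Python sketch below illustrates that idea only; it is not back-sql's code, the function names are hypothetical, and it uses `1=0` plus bound parameters instead of the inlined literals seen in the log.

```python
def substring_term(column, initial):
    """SQL term for an initial-substring match; constant false if undefined."""
    if column is None:  # filter item that was turned into (?=undefined)
        return "1=0", []
    return f"({column} LIKE ?)", [initial + "%"]


def or_filter(items):
    """Combine (column, initial) pairs into one OR clause with parameters."""
    terms, params = [], []
    for column, initial in items:
        sql, args = substring_term(column, initial)
        terms.append(sql)
        params.extend(args)
    return "(" + " OR ".join(terms) + ")", params


where, params = or_filter([
    ("ldap_entries.cn", "müller"),
    (None, None),  # the undefined mail item
    ("ldap_entries.sn", "müller"),
])
print(where)   # ((ldap_entries.cn LIKE ?) OR 1=0 OR (ldap_entries.sn LIKE ?))
print(params)  # ['müller%', 'müller%']
```

A false term is neutral inside an OR, which is why ignoring the undefined part does not change what the other two conditions match.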



The OpenLDAP Issue Tracking System uses a hacked version of JitterBug

______________
© Copyright 2013, OpenLDAP Foundation, info@OpenLDAP.org