OpenLDAP
Viewing Incoming/6696

From: Andrew.Gray@unlv.edu
Subject: back-sql and pagedResultsControl can be extremely heavy due to no LIMIT

Date: Thu, 04 Nov 2010 15:49:40 +0000
From: Andrew.Gray@unlv.edu
To: openldap-its@OpenLDAP.org
Subject: back-sql and pagedResultsControl can be extremely heavy due to no LIMIT
Full_Name: Andrew Gray
Version: 2.4.17
OS: Debian 5.0
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (131.216.14.1)


On receiving LDAP queries with a pagedResultsControl (in this case with a size
of 250), back-sql generates an extremely inefficient query for every iteration
in the form of:

SELECT DISTINCT ldap_entries.id,people.local_id,text('UNLVexpperson') AS objectClass,ldap_entries.dn AS dn
FROM ldap_entries,people,ldap_entry_objclasses
WHERE people.local_id=ldap_entries.keyval
  AND ldap_entries.oc_map_id=1
  AND upper(ldap_entries.dn) LIKE upper('%'||'%OU=PEOPLE,DC=UNLV,DC=EDU')
  AND ldap_entries.id>250
  AND (2=2 OR (ldap_entries.id=ldap_entry_objclasses.entry_id AND ldap_entry_objclasses.oc_name='UNLVexpperson'))

(this repeats for id>250, id>500, id>750, etc. etc.)

Ideally (IMO) there really should be a SQL LIMIT applied here: in this case
slapd gets back a few tens of thousands of rows on every iteration, its memory
usage explodes, and the process eventually gets killed.
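
The LIMIT idea in the report above can be illustrated with a minimal sketch. It is not back-sql code: it uses Python's sqlite3 module as a stand-in for the real RDBMS, a simplified ldap_entries table with only id and dn columns, and a hypothetical helper named fetch_page. Note also that LIMIT syntax is not portable across every SQL dialect, so a real change would likely need per-database handling.

```python
import sqlite3


def fetch_page(conn, last_id, page_size):
    """Return up to page_size (id, dn) rows with id > last_id, in id order.

    fetch_page is a hypothetical helper for illustration only.
    """
    cur = conn.execute(
        "SELECT id, dn FROM ldap_entries "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    )
    return cur.fetchall()


# Simplified stand-in for the ldap_entries table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ldap_entries (id INTEGER PRIMARY KEY, dn TEXT)")
conn.executemany(
    "INSERT INTO ldap_entries (id, dn) VALUES (?, ?)",
    [(i, f"uid=user{i},ou=People,dc=unlv,dc=edu") for i in range(1, 1001)],
)

# Second page of 250: only 250 rows cross the wire, not every row with id > 250.
page = fetch_page(conn, 250, 250)
```

The ORDER BY matters here: without a stable ordering on id, "id > last_id" combined with LIMIT would not guarantee that consecutive pages cover every row exactly once.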

Followup 1

Date: Tue, 09 Nov 2010 14:40:25 +0100
From: Pierangelo Masarati <masarati@aero.polimi.it>
To: Andrew.Gray@unlv.edu
CC: openldap-its@openldap.org
Subject: Re: (ITS#6696) back-sql and pagedResultsControl can be extremely heavy due to no LIMIT
Andrew.Gray@unlv.edu wrote:
> Full_Name: Andrew Gray
> Version: 2.4.17
> OS: Debian 5.0
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (131.216.14.1)
>
>
> On receiving LDAP queries with a pagedResultsControl (in this case with a size
> of 250), back-sql generates an extremely inefficient query for every iteration
> in the form of:
>
> SELECT DISTINCT ldap_entries.id,people.local_id,text('UNLVexpperson') AS objectClass,ldap_entries.dn AS dn
> FROM ldap_entries,people,ldap_entry_objclasses
> WHERE people.local_id=ldap_entries.keyval
>   AND ldap_entries.oc_map_id=1
>   AND upper(ldap_entries.dn) LIKE upper('%'||'%OU=PEOPLE,DC=UNLV,DC=EDU')
>   AND ldap_entries.id>250
>   AND (2=2 OR (ldap_entries.id=ldap_entry_objclasses.entry_id AND ldap_entry_objclasses.oc_name='UNLVexpperson'))
>
> (this repeats for id>250, id>500, id>750, etc. etc.)
>
> Ideally (IMO) there really should be a SQL LIMIT applied here, as in this case
> slapd gets back a few tens of thousands of rows on every iteration, and the
> memory usage explodes and eventually gets killed.

Using back-sql on large databases along with the pagedResults control is not 
advisable.  Limiting the number of entries returned by each query is not 
viable either, since some entries might not match the LDAP filter, ACLs, 
or the like, possibly leading to fewer than pageSize entries being returned 
within one page.  PagedResults could be removed from back-sql, and dealt 
with by an overlay that simply pages results returned by back-sql in a 
single internal search; this is probably the preferable approach, since 
it would also reduce the complexity of back-sql. 
However, I have little interest in improving back-sql, so patches are 
welcome, as usual...

p.
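
The objection above (a SQL LIMIT can produce short pages because rows are filtered again after they leave the database) can be made concrete with a small sketch. One conceivable workaround, not something proposed in this thread, is to keep fetching limited batches until the page is full. Everything here is hypothetical: fill_page, fetch_batch, and the is_even predicate, which merely stands in for LDAP filter and ACL evaluation.

```python
def fill_page(fetch_batch, matches, last_id, page_size):
    """Collect up to page_size rows that pass matches(), fetching in batches.

    fetch_batch(last_id, n) must return up to n rows (id first) with
    id > last_id, in ascending id order.
    Returns (entries, new_last_id, exhausted).
    """
    entries = []
    while len(entries) < page_size:
        batch = fetch_batch(last_id, page_size)
        if not batch:
            # No more candidate rows: the page may be short or empty.
            return entries, last_id, True
        for row in batch:
            last_id = row[0]  # remember the last row actually examined
            if matches(row):
                entries.append(row)
                if len(entries) == page_size:
                    break  # unexamined rows are re-fetched on the next call
    return entries, last_id, False


# Toy data source standing in for a LIMIT-ed SQL query.
rows = [(i, f"uid=user{i}") for i in range(1, 21)]


def fetch_batch(last_id, n):
    return [r for r in rows if r[0] > last_id][:n]


def is_even(row):
    # Stand-in for "entry matches the LDAP filter and passes ACLs".
    return row[0] % 2 == 0


entries, last_id, exhausted = fill_page(fetch_batch, is_even, 0, 5)
```

With half of the rows rejected after the fetch, a single batch of 5 yields only 2 entries, which is exactly the short-page problem described; the loop needs a second batch to fill the page. The cost is extra round trips when the post-filter rejects many rows.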



Followup 2

Date: Tue, 09 Nov 2010 18:54:36 -0800
From: Howard Chu <hyc@symas.com>
To: masarati@aero.polimi.it
CC: openldap-its@openldap.org
Subject: Re: (ITS#6696) back-sql and pagedResultsControl can be extremely heavy due to no LIMIT
masarati@aero.polimi.it wrote:
> Using back-sql on large databases along with pagedResult control is not
> advisable.  Limiting the number of entries returned by each query is not
> viable as well, since some entries might not match the LDAP filter, or
> ACLs or so, possibly leading to less than pageSize entries returned
> within one page.  PagedResults could be removed from back-sql, and dealt
> with by an overlay that simply pages results returned by back-sql in a
> single internal search; probably this is the preferable approach, since
> it would also result in a reduction of the complexity of back-sql.
> However, I have little interest in improving back-sql, so patches are
> welcome, as usual...

The sssvlv overlay already intercepts pagedResults requests if they occur in 
combination with the Sort control. It would be trivial to extend it to always 
intercept pagedResults, and then we can rip the paging support out of each of 
the backends. (Of course, there's a marginal efficiency advantage to letting 
back-bdb/hdb do its own paging. A configurable option might be best.)

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
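
The approach discussed above, an overlay that runs one internal search and hands the results out page by page, can be sketched conceptually as follows. This is not the sssvlv overlay or any slapd API: PagingOverlay, backend_search, and the cookie format are all hypothetical, and the sketch ignores cookie expiry, per-connection state, and size limits.

```python
import itertools
import secrets

_END = object()  # sentinel marking an exhausted result stream


class PagingOverlay:
    """Pages the results of a single backend search (illustrative only)."""

    def __init__(self, backend_search):
        self._search = backend_search  # callable: base -> iterable of entries
        self._pending = {}             # cookie -> iterator over remaining entries

    def search(self, base, page_size, cookie=None):
        if cookie is None:
            # First page: run the backend search exactly once.
            results = iter(self._search(base))
        else:
            # Later pages: resume the stored result stream.
            results = self._pending.pop(cookie)
        page = list(itertools.islice(results, page_size))
        # Peek one entry ahead to learn whether another page exists.
        nxt = next(results, _END)
        if nxt is _END:
            return page, None
        new_cookie = secrets.token_hex(8)
        self._pending[new_cookie] = itertools.chain([nxt], results)
        return page, new_cookie


calls = []


def backend_search(base):
    # Toy backend: records each invocation and yields 7 entries.
    calls.append(base)
    return (f"uid=user{i},{base}" for i in range(7))


overlay = PagingOverlay(backend_search)
base = "ou=People,dc=unlv,dc=edu"
pages = []
page, cookie = overlay.search(base, 3)
pages.append(page)
while cookie is not None:
    page, cookie = overlay.search(base, 3, cookie)
    pages.append(page)
```

The backend sees one search no matter how many pages the client requests; the trade-off is that the overlay must hold the remaining results (or a live iterator over them) between requests.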



Followup 3

Date: Wed, 10 Nov 2010 09:26:31 +0100
From: Pierangelo Masarati <masarati@aero.polimi.it>
To: Howard Chu <hyc@symas.com>
CC: openldap-its@openldap.org
Subject: Re: (ITS#6696) back-sql and pagedResultsControl can be extremely heavy due to no LIMIT
Howard Chu wrote:

>> Using back-sql on large databases along with pagedResult control is not
>> advisable.  Limiting the number of entries returned by each query is not
>> viable as well, since some entries might not match the LDAP filter, or
>> ACLs or so, possibly leading to less than pageSize entries returned
>> within one page.  PagedResults could be removed from back-sql, and dealt
>> with by an overlay that simply pages results returned by back-sql in a
>> single internal search; probably this is the preferable approach, since
>> it would also result in a reduction of the complexity of back-sql.
>> However, I have little interest in improving back-sql, so patches are
>> welcome, as usual...
> 
> The sssvlv overlay already intercepts pagedResults requests if they 
> occur in combination with the Sort control. It would be trivial to 
> extend it to always intercept pagedResults, and then we can rip the 
> paging support out of each of the backends. (Of course, there's a 
> marginal efficiency advantage to letting back-bdb/hdb do its own paging. 
> A configurable option might be best.)

That's more or less what I had in mind.  I assume you merged the two 
functionalities in one overlay because pagedResult needs special care 
when combined with SSSVLV; this might be true for other 
functionalities as well (e.g. having efficient pagedResult; life would 
be much better without it, since clients do not need it while it makes 
servers' life harder).  With respect to conditionally exploiting the native 
pagedResult capabilities of back-bdb/hdb, I only fear some issues related 
to glued databases.  Those could possibly be solved by disabling native 
back-bdb/hdb pagedResult handling when used in glued databases or, even 
more granularly, when a search spans more than one database in a glued 
configuration, delegating the handling to the overlay in those cases.

p.


The OpenLDAP Issue Tracking System uses a hacked version of JitterBug

______________
© Copyright 2013, OpenLDAP Foundation, info@OpenLDAP.org