OpenLDAP Issue Tracking System

Viewing Archive.Incoming/3343
Date: Wed, 22 Sep 2004 14:43:39 GMT
From: daniel.armbrust@mayo.edu
To: openldap-its@OpenLDAP.org
Subject: scope not ok errors on very large databases
Full_Name: Daniel Armbrust
Version: 2.2.17
OS: Fedora Core 2
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (129.176.151.126)


I have hit a problem where OpenLDAP seems to stop using its indexes (or just
does something wrong with them), and instead of quickly doing a one-level
search, it ends up scanning the entire database, returning tons of "scope not
ok" messages.

The interesting part is that this only happens on one particular node in my
database.  Renaming the node to something else has no effect.  It also seems to
be size related: when we split up the LDIF that we load the database from, so
that we load this node in pieces, we can load some parts of this node, but
not all.

For example, when I split this particular node into 3 parts, I can load part 1
and things work.  If I then load part 3, things fail.  If I start over and load
just part 3, things still work; but when I then add part 1, things fail.

Configuration details:

I have openldap 2.2.17 installed on a Fedora Core 2 machine, using a fully
patched Berkeley 4.2.52. The problem originally surfaced on my 2.2.15 install.

We have a custom schema, which I can provide if it would be useful.
The rest of my slapd.conf file looks like this:

pidfile         ndfrt.pid
schemacheck     on
idletimeout     14400
threads         150
sizelimit       6000
access to       * by * write
  
database        bdb
suffix          "service=NDF-RT,dc=LexGrid,dc=org"
rootdn          "cn=yadayada,service=NDF-RT,dc=LexGrid,dc=org"
rootpw          "something for me to know..."
directory       /localwork/ldap/database/dbndfrt/

index           objectClass eq
index           conceptCode eq
index           language pres,eq
index           dc eq
index           sourceConcept,targetConcept,association,presentationId eq
index           text,entityDescription pres,eq,sub,subany


My DB_CONFIG file looks like this:

set_flags       DB_TXN_NOSYNC
set_flags       DB_TXN_NOT_DURABLE
set_cachesize   0       102400000       1


My full LDIF file is 720 MB, so it's a little hard to post.  But I load the
entire database with slapadd.

After the database is loaded, if I connect to it with Softerra LDAP
Administrator and browse down to the problem node, everything works fine.
I have Softerra configured to use paged results, currently set to 100 items.

When I click to expand the problem node (get its immediate children), this is
what the server does (log level 1):

(the node I am expanding is "conceptCode=kc8")



connection_get(9): got connid=0
connection_read(9): checking for input on id=0
ber_get_next
ber_get_next: tag 0x30 len 251 contents:
ber_get_next
ber_get_next on fd 9 failed errno=11 (Resource temporarily unavailable)
do_search
ber_scanf fmt ({miiiib) ber:
>>> dnPrettyNormal:
<conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org>
=> ldap_bv2dn(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org,0)
ldap_err2string
<= ldap_bv2dn(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)=0
Success
=> ldap_dn2bv(272)
ldap_err2string
<= ldap_dn2bv(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)=0
Success
=> ldap_dn2bv(272)
ldap_err2string
<= ldap_dn2bv(conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org)=0
Success
<<< dnPrettyNormal:
<conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org>,
<conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org>
ber_scanf fmt (m) ber:
ber_scanf fmt ({M}}) ber:
=> get_ctrls
ber_scanf fmt ({m) ber:
ber_scanf fmt (b) ber:
ber_scanf fmt (m) ber:
=> get_ctrls: oid="1.2.840.113556.1.4.473" (noncritical)
ber_scanf fmt ({m) ber:
ber_scanf fmt (b) ber:
ber_scanf fmt (m) ber:
=> get_ctrls: oid="1.2.840.113556.1.4.319" (critical)
ber_scanf fmt ({im}) ber:
<= get_ctrls: n=2 rc=0 err=""
==> limits_get: conn=0 op=9 dn="[anonymous]"
=> bdb_search
bdb_dn2entry("conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org")
search_candidates: base="conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org"
(0x0000000b) scope=1
=> bdb_dn2idl( "conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org"
)
<= bdb_dn2idl: id=-1 first=12 last=2821415
=> bdb_presence_candidates (objectClass)
bdb_search_candidates: id=-1 first=12 last=2821415
=> send_search_entry:
dn="propertyId=P-KC8-0,conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org"
(attrsOnly)
ber_flush: 127 bytes to sd 9
<= send_search_entry
bdb_search: 13 scope not okay
bdb_search: 14 scope not okay
bdb_search: 15 scope not okay

Message of length 9482 truncated

Followup 1

Date: Fri, 01 Oct 2004 02:37:47 -0700
From: Howard Chu <hyc@symas.com>
To: daniel.armbrust@mayo.edu
CC: openldap-its@OpenLDAP.org
Subject: Re: scope not ok errors on very large databases (ITS#3343)
daniel.armbrust@mayo.edu wrote:

>Full_Name: Daniel Armbrust
>Version: 2.2.17
>OS: Fedora Core 2
>URL: ftp://ftp.openldap.org/incoming/
>Submission from: (NULL) (129.176.151.126)
>
>
>I have hit a problem where OpenLDAP seems to stop using its indexes (or just
>does something wrong with them), and instead of quickly doing a one-level
>search, it ends up scanning the entire database, returning tons of "scope not
>ok" messages.
>
>The interesting part is that this only happens on one particular node in my
>database.  Renaming the node to something else has no effect.  It also seems
>to be size related: when we split up the LDIF that we load the database from,
>so that we load this node in pieces, we can load some parts of this node, but
>not all.
>
Exactly how large is this "node"? You say it fails when you do a 
one-level search under it - how many immediate children does it have? 
There is a known limitation in back-bdb's index design; when any index 
slot hits 65536 entries it gets converted from an explicit list of 
entries into a "range". If the entries in this slot were not added in 
sorted order, then the range may span a large portion of the database.

For example, assuming the slot size was 4, and you had an index slot 
with entry IDs
     2,6,25,57
if you added a new entry under this slot, entry ID 99, this index slot 
would be converted into a range
     2-99
which would include quite a large number of entries that really have 
nothing to do with that slot.

You can tweak the slot sizes in back-bdb/idl.h BDB_IDL_DB_SIZE and 
BDB_IDL_UM_SIZE and recompile. I believe UM_SIZE must always be at least 
twice the DB_SIZE. You will also need to dump the database to LDIF 
before making this change, and reload from scratch afterward.

Also, loading your database in sorted order will help minimize the 
impact of this problem. I.e., make sure that all of the children of a 
particular node are loaded contiguously, without other intervening 
entries. This only helps when the DIT is relatively flat.

Originally back-hdb did not have this problem, although it does now 
because it shares the same search/indexing mechanism.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support



Followup 2

From: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
To: "'Howard Chu'" <hyc@symas.com>
Cc: openldap-its@OpenLDAP.org
Subject: RE: scope not ok errors on very large databases (ITS#3343)
Date: Fri, 1 Oct 2004 14:42:54 -0500 
Thanks for the info.  I'll try changing the parameters and reloading it next
week sometime.

We have 259,423 direct children and about 690,264 total children under the
problem node.

Dan






Followup 3

From: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@OpenLDAP.org
Subject: RE: scope not ok errors on very large databases (ITS#3343)
Date: Wed, 6 Oct 2004 15:46:40 -0500 
 
I changed the values you recommended to:
Code from openldap-2.2.17/servers/slapd/back-bdb/idl.h:

/* IDL sizes - likely should be even bigger
 *   limiting factors: sizeof(ID), thread stack size
 */
#define BDB_IDL_DB_SIZE         (1<<18) /* 64K IDL on disk - dan modified to 256K */
#define BDB_IDL_UM_SIZE         (1<<19) /* 128K IDL in memory - dan modified to 512K */


And now I get a segmentation fault when I run "make test"

>>>>> Starting test003-search ...
running defines.sh
Running slapadd to build slapd database...
Running slapindex to index slapd database...
Starting slapd on TCP/IP port 9011...
Testing slapd searching...
Waiting 5 seconds for slapd to start...
Testing exact searching...
Testing approximate searching...
Testing OR searching...
Testing AND matching and ends-with searching...
./scripts/test003-search: line 100:  7856 Segmentation fault      $SLAPD -f
$CONF1 -h $URI1 -d $LVL $TIMING >$LOG1 2>&1
ldapsearch failed (255)!
./scripts/test003-search: line 104: kill: (7856) - No such process
>>>>> ./scripts/test003-search failed (exit 255)
make[2]: *** [bdb-yes] Error 255
make[2]: Leaving directory `/home/armbrust/temp/openldap-2.2.17/tests'
make[1]: *** [test] Error 2
make[1]: Leaving directory `/home/armbrust/temp/openldap-2.2.17/tests'
make: *** [test] Error 2

Did I mess up changing the params?
Dan



Followup 4

Date: Wed, 06 Oct 2004 15:02:35 -0700
From: Howard Chu <hyc@symas.com>
To: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
CC: openldap-its@OpenLDAP.org
Subject: Re: [JunkMail] RE: scope not ok errors on very large databases (ITS#3343)
Hm.... Would need a gdb stack trace to be sure, but most likely the new 
size is too large for the regular thread stack. You'll need to increase 
the size of LDAP_PVT_THREAD_STACK_SIZE and recompile libldap_r to change 
that, and relink slapd.


-- 
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support



Followup 5

From: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@OpenLDAP.org
Subject: RE: scope not ok errors on very large databases (ITS#3343)
Date: Wed, 6 Oct 2004 16:38:57 -0500 
 
Further data point: if I only double (instead of quadruple) the values, so now
I'm using

#define BDB_IDL_DB_SIZE         (1<<17) /* 64K IDL on disk - dan modified to 128K */
#define BDB_IDL_UM_SIZE         (1<<18) /* 128K IDL in memory - dan modified to 256K */

then all make tests pass.  I don't think this will be enough extra size to
fix my problem, however.  I'm starting a new load right now to determine
whether the behavior has changed at all.

Dan



Followup 6

From: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@OpenLDAP.org
Subject: RE: scope not ok errors on very large databases (ITS#3343)
Date: Thu, 7 Oct 2004 11:00:46 -0500 
 
Followup to the last post: when I reloaded my problematic database on OpenLDAP
2.2.17 using these parameters:

#define BDB_IDL_DB_SIZE         (1<<17) /* 64K IDL on disk - dan modified to 128K */
#define BDB_IDL_UM_SIZE         (1<<18) /* 128K IDL in memory - dan modified to 256K */

The problem that I reported in the initial post went away.  I am now able to
view/search, etc., on this large node in my database.  This surprised me,
because I didn't think that I had increased the size of the key enough to fix
the problem.  It must be because my ~260,000 entries are being split across
multiple keys (I'm not sure why; they are almost all aliases).

Am I likely to run into any other problems by using these larger values?  If
not, is there a reason not to update openldap itself to use these larger values?

Thanks,

Dan



Followup 7

From: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@OpenLDAP.org
Subject: RE: scope not ok errors on very large databases (ITS#3343)
Date: Thu, 7 Oct 2004 11:38:16 -0500 
PS: Howard, you were right again about LDAP_PVT_THREAD_STACK_SIZE.

I tried changing the multiple from 4 to 8, then changed the other variables
back to bit shifts of 18 and 19, rebuilt all of OpenLDAP, and this time all
of the make tests passed.


I suppose this issue is a matter of balancing the scalability of OpenLDAP
against its ability to run on a machine with limited RAM.  I'll add these
changes to my notes, and hopefully remember to set them before all future
builds.

Thanks for your expertise! 

Dan




Followup 8

Date: Thu, 07 Oct 2004 14:36:35 -0700
From: Howard Chu <hyc@symas.com>
To: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
CC: openldap-its@OpenLDAP.org
Subject: RE: scope not ok errors on very large databases (ITS#3343)
Armbrust, Daniel C. wrote:

>Am I likely to run into any other problems by using these larger values?  If
>not, is there a reason not to update openldap itself to use these larger
>values?

The main issue is memory usage. Every IDL slot is 4 bytes, so 256K of 
them is 1024KB of memory. back-bdb preallocates a search stack for every 
thread; this stack is configurable in slapd.conf but defaults to 8 
chunks, so that's 8*1024KB = 8MB. Also, one or two of them may need to 
fit on the regular thread stack, as you already saw. All of this adds up 
quickly, especially if you have a large number of threads configured. 
These default values were chosen a long time ago, when slapd still 
defaulted to 32 threads (as opposed to 16 now), and were reasonable for 
a typical 32-bit machine. But obviously there's room here for tuning, 
and if you were to create a 64-bit build you'd probably want even larger 
sizes.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support



Followup 9

From: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
To: openldap-its@openldap.org
Subject: RE: scope not ok errors on very large databases (ITS#3343)
Date: Wed, 23 Feb 2005 08:31:48 -0600
Possibly related information I forgot to put in the initial report:
I have made the following modifications to my instance of 2.2.23 (because of
this bug: http://www.openldap.org/its/index.cgi?findid=3343).
In the file 'servers/slapd/back-bdb/idl.h' I modified these two lines:
#define BDB_IDL_DB_SIZE (1<<16) /* 64K IDL on disk*/
#define BDB_IDL_UM_SIZE (1<<17) /* 128K IDL in memory*/

so that they read:
#define BDB_IDL_DB_SIZE (1<<18) /* 256K IDL on disk*/
#define BDB_IDL_UM_SIZE (1<<19) /* 512K IDL in memory*/

In the file 'include/ldap_pvt_thread.h', I changed the line that says:
#define LDAP_PVT_THREAD_STACK_SIZE (4*1024*1024)

to:
#define LDAP_PVT_THREAD_STACK_SIZE (8*1024*1024)


The OpenLDAP Issue Tracking System uses a hacked version of JitterBug

______________
© Copyright 2013, OpenLDAP Foundation, info@OpenLDAP.org