
scope not ok errors on very large databases (ITS#3343)



Full_Name: Daniel Armbrust
Version: 2.2.17
OS: Fedora Core 2
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (129.176.151.126)


I have hit a problem where OpenLDAP seems to stop using its indexes (or does
something wrong with them): instead of quickly doing a one-level search, it
ends up scanning the entire database, logging tons of "scope not okay"
messages.

The interesting part is that this only happens on one particular node in my
database.  Renaming the node to something else has no effect.  It also seems to
be size related: when we split up the ldif that we load the database from,
such that we load this node in pieces, we can load some parts of this node, but
not all.

For example, when I split this particular node into 3 parts, I can load part 1,
and things work.  If I then load part 3, things fail.  If I back up and load
just part 3, things still work.  But then when I add part 1, things fail.

Configuration details:

I have OpenLDAP 2.2.17 installed on a Fedora Core 2 machine, using a fully
patched Berkeley DB 4.2.52. The problem originally surfaced on my 2.2.15 install.

We have a custom schema, which I can provide if it would be useful.
The rest of my slapd.conf file looks like this:

pidfile         ndfrt.pid
schemacheck     on
idletimeout     14400
threads         150
sizelimit       6000
access to       * by * write
  
database        bdb
suffix          "service=NDF-RT,dc=LexGrid,dc=org"
rootdn          "cn=yadayada,service=NDF-RT,dc=LexGrid,dc=org"
rootpw          "something for me to know..."
directory       /localwork/ldap/database/dbndfrt/

index           objectClass eq
index           conceptCode eq
index           language pres,eq
index           dc eq
index           sourceConcept,targetConcept,association,presentationId eq
index           text,entityDescription pres,eq,sub,subany


My DB_CONFIG file looks like this:

set_flags       DB_TXN_NOSYNC
set_flags       DB_TXN_NOT_DURABLE
set_cachesize   0       102400000       1
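
For reference, the three fields of set_cachesize are gigabytes, additional
bytes, and the number of cache segments, so the line above asks for roughly
98 MB of cache in a single segment. A small sketch of that arithmetic (the
helper name cache_bytes is made up for illustration and is not part of any
Berkeley DB tool):

```python
def cache_bytes(directive: str) -> tuple[int, int]:
    """Return (total_bytes, ncache) for a DB_CONFIG set_cachesize line.

    Assumes the documented field order: gbytes, bytes, ncache, with one
    gigabyte counted as 2**30 bytes.
    """
    fields = directive.split()
    if len(fields) != 4 or fields[0] != "set_cachesize":
        raise ValueError(f"not a set_cachesize line: {directive!r}")
    gbytes, extra_bytes, ncache = (int(f) for f in fields[1:])
    return gbytes * 1024**3 + extra_bytes, ncache


total, segments = cache_bytes("set_cachesize   0       102400000       1")
print(f"{total} bytes (~{total / 1024**2:.1f} MiB) in {segments} segment(s)")
```

That is small next to a 720 MB ldif, although whether the cache size has
anything to do with the failure is not established here.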


My full ldif file is 720 MB, so it's a little hard to post.  I load the
entire database with slapadd.

After the database is loaded, if I connect to it with Softerra LDAP
Administrator and browse down to the problem node, everything works fine.
I have Softerra configured to use paged results - currently set to 100 items.

When I click to expand the problem node (get its immediate children), this is
what the server does (log level 1):

(the node I am expanding is "conceptCode=kc8")
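
To make "immediate children" concrete: a one-level search should only match
entries whose parent DN is the search base. The following is a rough sketch
of that test, not slapd's actual code; it assumes DNs without escaped
commas and uses case-folding as a stand-in for real DN normalization. The
function name is_immediate_child is hypothetical.

```python
def is_immediate_child(entry_dn: str, base_dn: str) -> bool:
    """True if entry_dn sits exactly one level below base_dn.

    Assumption: no escaped commas in the DN, so everything after the
    first "," is the parent DN.
    """
    _, sep, parent = entry_dn.partition(",")
    return bool(sep) and parent.lower() == base_dn.lower()


base = ("conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,"
        "dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org")

# A child of the node being expanded: in scope.
print(is_immediate_child("propertyId=P-KC8-0," + base, base))

# An entry under a different concept: out of scope for this search.
other = ("propertyId=P-C190-0,conceptCode=C190,dc=concepts,"
         "codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,"
         "dc=LexGrid,dc=org")
print(is_immediate_child(other, base))
```

Entries like the second one are the kind the server reports as
"scope not okay" in the log that follows.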



connection_get(9): got connid=0
connection_read(9): checking for input on id=0
ber_get_next
ber_get_next: tag 0x30 len 251 contents:
ber_get_next
ber_get_next on fd 9 failed errno=11 (Resource temporarily unavailable)
do_search
ber_scanf fmt ({miiiib) ber:
>>> dnPrettyNormal: <conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org>
=> ldap_bv2dn(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org,0)
ldap_err2string
<= ldap_bv2dn(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)=0
Success
=> ldap_dn2bv(272)
ldap_err2string
<= ldap_dn2bv(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)=0
Success
=> ldap_dn2bv(272)
ldap_err2string
<= ldap_dn2bv(conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org)=0
Success
<<< dnPrettyNormal: <conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org>,
<conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org>
ber_scanf fmt (m) ber:
ber_scanf fmt ({M}}) ber:
=> get_ctrls
ber_scanf fmt ({m) ber:
ber_scanf fmt (b) ber:
ber_scanf fmt (m) ber:
=> get_ctrls: oid="1.2.840.113556.1.4.473" (noncritical)
ber_scanf fmt ({m) ber:
ber_scanf fmt (b) ber:
ber_scanf fmt (m) ber:
=> get_ctrls: oid="1.2.840.113556.1.4.319" (critical)
ber_scanf fmt ({im}) ber:
<= get_ctrls: n=2 rc=0 err=""
==> limits_get: conn=0 op=9 dn="[anonymous]"
=> bdb_search
bdb_dn2entry("conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org")
search_candidates: base="conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org"
(0x0000000b) scope=1
=> bdb_dn2idl( "conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org"
)
<= bdb_dn2idl: id=-1 first=12 last=2821415
=> bdb_presence_candidates (objectClass)
bdb_search_candidates: id=-1 first=12 last=2821415
=> send_search_entry:
dn="propertyId=P-KC8-0,conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org"
(attrsOnly)
ber_flush: 127 bytes to sd 9
<= send_search_entry
bdb_search: 13 scope not okay
bdb_search: 14 scope not okay
bdb_search: 15 scope not okay
bdb_search: 16 scope not okay
bdb_search: 17 scope not okay
bdb_search: 18 scope not okay
bdb_search: 19 scope not okay
bdb_search: 20 scope not okay
bdb_search: 21 scope not okay
bdb_search: 22 scope not okay
bdb_search: 23 scope not okay
bdb_search: 24 scope not okay
bdb_search: 25 scope not okay
bdb_search: 26 scope not okay
bdb_search: 27 scope not okay
bdb_search: 28 scope not okay
bdb_search: 29 scope not okay

<SNIP>

bdb_search: 51 scope not okay
connection_get(9): got connid=0
connection_read(9): checking for input on id=0
ber_get_next
ber_get_next: tag 0x30 len 143 contents:
ber_get_next
ber_get_next on fd 9 failed errno=11 (Resource temporarily unavailable)
bdb_search: 52 scope not okay
bdb_search: 53 scope not okay
bdb_search: 54 scope not okay
do_search
ber_scanf fmt ({miiiib) ber:
>>> dnPrettyNormal: <conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org>
=> ldap_bv2dn(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org,0)
ldap_err2string
<= ldap_bv2dn(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)=0
Success
=> ldap_dn2bv(272)
ldap_err2string
<= ldap_dn2bv(conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)=0
Success
=> ldap_dn2bv(272)
ldap_err2string
<= ldap_dn2bv(conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org)=0
Success
<<< dnPrettyNormal: <conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org>,
<conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org>
ber_scanf fmt (m) ber:
ber_scanf fmt ({M}}) ber:
==> limits_get: conn=0 op=10 dn="[anonymous]"
=> bdb_search
bdb_dn2entry("conceptCode=kc8,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org")
=> send_search_entry:
dn="conceptCode=KC8,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org"
ber_flush: 195 bytes to sd 9
<= send_search_entry
bdb_search: 55 scope not okay
send_ldap_result: conn=0 op=10 p=3
send_ldap_response: msgid=31 tag=101 err=0
ber_flush: 14 bytes to sd 9
bdb_search: 56 scope not okay
bdb_search: 57 scope not okay

<SNIP>

bdb_search: 221 scope not okay
entry_decode: "propertyId=P-C190-0,conceptCode=C190,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org"
<= entry_decode(propertyId=P-C190-0,conceptCode=C190,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)
=> bdb_dn2id( "propertyId=p-c190-0,conceptCode=c190,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org"
)
<= bdb_dn2id: got id=0x000000de
bdb_search: 222 scope not okay
entry_decode: "propertyId=SearchName-1,conceptCode=C190,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org"
<= entry_decode(propertyId=SearchName-1,conceptCode=C190,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)
=> bdb_dn2id( "propertyId=searchname-1,conceptCode=c190,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org"
)
<= bdb_dn2id: got id=0x000000df
bdb_search: 223 scope not okay
entry_decode: "conceptCode=C192,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org"
<= entry_decode(conceptCode=C192,dc=concepts,codingScheme=NDF-RT,dc=codingSchemes,service=NDF-RT,dc=LexGrid,dc=org)
=> bdb_dn2id( "conceptCode=c192,dc=concepts,codingScheme=ndf-rt,dc=codingschemes,service=ndf-rt,dc=lexgrid,dc=org"
)
<= bdb_dn2id: got id=0x000000e0
bdb_search: 224 scope not okay

It continues like this until the timeout is reached, and then returns an
error to the client.
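
A rough way to quantify this from a saved log is to count the rejected IDs
and compare them with the range printed by bdb_dn2idl. The sketch below
assumes the line formats shown above and assumes that first/last are the
bounds of the candidate ID range; the function name summarize is invented
for the example.

```python
import re

SCOPE_RE = re.compile(r"^bdb_search: (\d+) scope not okay$")
RANGE_RE = re.compile(r"^<= bdb_dn2idl: id=(-?\d+) first=(\d+) last=(\d+)$")


def summarize(lines):
    """Count 'scope not okay' rejections and record the last candidate range."""
    rejected = []
    candidate_range = None
    for raw in lines:
        line = raw.strip()
        m = SCOPE_RE.match(line)
        if m:
            rejected.append(int(m.group(1)))
            continue
        m = RANGE_RE.match(line)
        if m:
            candidate_range = (int(m.group(2)), int(m.group(3)))
    return {
        "rejected": len(rejected),
        "lowest": min(rejected, default=None),
        "highest": max(rejected, default=None),
        "candidate_range": candidate_range,
    }


# A few lines copied from the log above; a real run would read the log file.
sample = [
    "<= bdb_dn2idl: id=-1 first=12 last=2821415",
    "bdb_search: 13 scope not okay",
    "bdb_search: 14 scope not okay",
    "bdb_search: 224 scope not okay",
]
report = summarize(sample)
first, last = report["candidate_range"]
print(report, "range spans", last - first + 1, "IDs")
```

If that reading of first/last is right, the search is walking a range of
about 2.8 million IDs rather than just the children of the node.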

The last time I saw this error was when there was a bug in the paged results
code, but this occurs no matter what the paged results setting is.  There are
other large nodes in this database, and they all work correctly.  I have also
loaded over 2 GB of ldif into other OpenLDAP databases before without running
into this error.  What I don't know, however, is whether I have ever loaded
this many entries under one node before.  This node itself is about 190 MB
worth of ldif, so I could be hitting a limitation (or bug) there that I have
never tickled before.