
Problem with Openldap and BDB - Machine freezes



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
 
Hi list

I hope someone can help me out. I have a strange situation here, and I
have tried a number of things to fix the problem. Here is some
information about my setup:

Gentoo Linux on IBM iSeries (ppc64), kernel is 2.6.10 (768 MB RAM
allocated)
Software: openldap-2.2.27 and db-4.2.52 (patched with patch.4.2.52.1
and patch.4.2.52.2)

- -- slapd.conf
include         /etc/openldap/schema/core.schema
include         /etc/openldap/schema/cosine.schema
include         /etc/openldap/schema/inetorgperson.schema
include         /etc/openldap/schema/nis.schema
include         /etc/openldap/schema/samba.schema
pidfile         /var/run/openldap/slapd.pid
argsfile        /var/run/openldap/slapd.args
sizelimit       10000
timelimit       30
idletimeout     15
database        bdb
directory       /var/lib/openldap-bdb
conn_max_pending        300
threads         8
checkpoint      128 15
cachesize       10000
idlcachesize    10000
suffix          "dc=brenntag,dc=com"
rootdn          "cn=Manager,dc=brenntag,dc=com"
index           objectClass             eq
index           cn                      pres,sub,eq
index           sn                      pres,sub,eq
index           uid                     pres,sub,eq
index           displayName             pres,sub,eq
index           uidNumber               eq
index           gidNumber               eq
index           memberUid               eq
index           sambaSID                eq
index           sambaPrimaryGroupSID    eq
index           sambaDomainName         eq
index           default                 sub
rootpw  {SSHA}XXXXXXXXXXXXXXXXX
replogfile /var/log/replicate/slurpd.replog
replica host=10.17.151.3:389
        tls=no
        binddn="cn=Manager,dc=brenntag,dc=com"
        bindmethod=simple
        credentials=secretpassword
<and the acl part, which I doubt is important>

- -- DB_CONFIG
set_cachesize 0 33554432 0
set_lg_regionmax 262144
set_lg_bsize 2097152
set_lg_dir              /var/log/openldap (separate partition)
- --
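
The 32 MB requested by set_cachesize above and the 40 MB total cache
size that db_stat reports further down look consistent if Berkeley DB's
documented rule applies, namely that caches smaller than 500 MB are
enlarged by 25% to cover buffer pool overhead. A minimal Python sketch
of that arithmetic follows; the function name is hypothetical and the
result is only an estimate, not what the library computes internally:

```python
# Sketch: estimate the buffer pool size Berkeley DB allocates for a
# given set_cachesize line. The 25% rule for caches under 500 MB is
# taken from the Berkeley DB documentation; the helper name is made up.

GIB = 1024 ** 3
MIB = 1024 ** 2


def effective_cache_bytes(line: str) -> int:
    """Return the approximate pool size for 'set_cachesize G B N'."""
    keyword, gbytes, nbytes, _ncache = line.split()
    if keyword != "set_cachesize":
        raise ValueError(f"not a set_cachesize line: {line!r}")
    requested = int(gbytes) * GIB + int(nbytes)
    # Caches smaller than 500 MB are grown by 25% for pool overhead.
    if requested < 500 * MIB:
        return requested + requested // 4
    return requested


if __name__ == "__main__":
    size = effective_cache_bytes("set_cachesize 0 33554432 0")
    print(f"{size} bytes = {size / MIB:.0f} MiB")  # 41943040 bytes = 40 MiB
```

The estimate is close to the size of __db.002 in the directory listing
below, which suggests that file backs the cache region.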

- -- /proc/meminfo
MemTotal:       753152 kB
MemFree:        519088 kB
Buffers:         55804 kB
Cached:         110856 kB
SwapCached:          0 kB
Active:         177468 kB
Inactive:        36400 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       753152 kB
LowFree:        519088 kB
SwapTotal:     1004052 kB
SwapFree:      1004052 kB
Dirty:               0 kB
Writeback:           0 kB
Mapped:          67712 kB
Slab:            16016 kB
CommitLimit:   1380628 kB
Committed_AS:   208944 kB
PageTables:       1124 kB
VmallocTotal: 2147483647 kB
VmallocUsed:       504 kB
VmallocChunk: 2147483143 kB
- --

- -- # ls -l /var/lib/openldap-bdb
- -rw-r--r--  1 ldap ldap      173 Jul 26 13:28 DB_CONFIG
- -rw-------  1 ldap ldap     8192 Jul 26 13:28 __db.001
- -rw-------  1 ldap ldap 41951232 Jul 26 13:28 __db.002
- -rw-------  1 ldap ldap  2359296 Jul 26 13:28 __db.003
- -rw-------  1 ldap ldap   565248 Jul 26 13:28 __db.004
- -rw-------  1 ldap ldap    16384 Jul 26 13:28 __db.005
- -rw-------  1 ldap ldap   520192 Aug  1 13:26 cn.bdb
- -rw-------  1 ldap ldap   266240 Aug  1 08:39 displayName.bdb
- -rw-------  1 ldap ldap   307200 Aug  1 13:26 dn2id.bdb
- -rw-------  1 ldap ldap    36864 Aug  1 13:26 gidNumber.bdb
- -rw-------  1 ldap ldap  2031616 Aug  1 22:33 id2entry.bdb
- -rw-------  1 ldap ldap    40960 Aug  1 05:03 memberUid.bdb
- -rw-------  1 ldap ldap   151552 Aug  1 13:26 objectClass.bdb
- -rw-------  1 ldap ldap    28672 Aug  1 13:26 sambaDomainName.bdb
- -rw-------  1 ldap ldap    36864 Aug  1 13:26 sambaPrimaryGroupSID.bdb
- -rw-------  1 ldap ldap    36864 Aug  1 13:26 sambaSID.bdb
- -rw-------  1 ldap ldap   151552 Aug  1 02:00 sn.bdb
- -rw-------  1 ldap ldap   417792 Aug  1 13:26 uid.bdb
- -rw-------  1 ldap ldap    40960 Aug  1 13:26 uidNumber.bdb
- --

As you can see, this is the OpenLDAP backend for my Samba setup.

The problem is that after the machine has been running for some time,
it starts to freeze periodically. During a freeze it still answers ICMP
echo requests (ping), but nothing else. When I look through the log
file I see this:

- -- slapd.log
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18085
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18087 fd=23 ACCEPT from
IP=10.17.151.3:36574 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18085
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18086
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18088 fd=24 ACCEPT from
IP=10.17.151.2:51138 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18086
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18087
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18089 fd=29 ACCEPT from
IP=10.17.151.3:36575 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18087
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18088
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18090 fd=30 ACCEPT from
IP=10.17.151.3:36576 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18088
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18089
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18091 fd=31 ACCEPT from
IP=10.17.151.3:36577 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18089
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18090
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18092 fd=32 ACCEPT from
IP=10.17.151.2:51139 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18090
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18091
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18093 fd=33 ACCEPT from
IP=10.17.151.2:51140 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18091
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18092
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18093
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18093
deferring operation: binding
- --
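
To compare a freeze against a normal day, it may help to count how
often each connection is deferred. The following is a minimal Python
sketch, assuming log lines shaped like the excerpt above; the function
name and the regular expression are illustrative and not part of
OpenLDAP:

```python
import re
from collections import Counter

# Matches slapd lines such as
# "... connection_input: conn=18085 deferring operation: binding"
DEFERRED = re.compile(
    r"connection_input: conn=(\d+) deferring operation: (\w+)"
)


def count_deferred(log_text: str) -> Counter:
    """Count deferred operations per connection id."""
    # Mail clients wrap long syslog lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    return Counter(conn for conn, _reason in DEFERRED.findall(flat))


if __name__ == "__main__":
    sample = """\
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18085
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: conn=18087 fd=23 ACCEPT from
IP=10.17.151.3:36574 (IP=0.0.0.0:389)
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18085
deferring operation: binding
Aug  1 22:32:13 area51 slapd[7638]: connection_input: conn=18086
deferring operation: binding
"""
    for conn, hits in count_deferred(sample).most_common():
        print(f"conn={conn} deferred {hits} time(s)")
```

Running the same count over a quiet period and over a freeze would show
whether the deferrals pile up on a few connections or on all of them.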

I also see these messages on a normal day, but far fewer of them, and
nothing the users notice (we have approx. 350 users). I have tried
different versions of OpenLDAP, so I guess that is not the problem. I
suspect the DB_CONFIG file and BDB instead. I have read some of the
documentation, and as it says, there is no exact recipe for tuning
this; I have to try different settings. I tried many different cache
sizes, but every attempt ended with the behaviour above, sometimes
within 2 days, sometimes not for 30 days. I find this really strange.

When I look at the files in the openldap-bdb directory, I see that the
__db.* files haven't been touched in a while. How come? Could that be a
problem?

- From time to time I see lines like this in the slapd.log too:

Aug  1 00:12:56 area51 slapd[5830]: <= bdb_equality_candidates:
(uniqueMember) index_param failed (18)
Aug  1 00:12:56 area51 slapd[5665]: <= bdb_equality_candidates:
(uniqueMember) index_param failed (18)
Aug  1 00:12:56 area51 slapd[5830]: <= bdb_equality_candidates:
(uniqueMember) index_param failed (18)

Here is the output from db_stat:

40MB 2KB 24B    Total cache size.
1       Number of caches.
40MB 8KB        Pool individual cache size.
0       Requested pages mapped into the process' address space.
2613548 Requested pages found in the cache (100%).
26      Requested pages not found in the cache.
595     Pages created in the cache.
26      Pages read into the cache.
1183    Pages written from the cache to the backing file.
0       Clean pages forced from the cache.
0       Dirty pages forced from the cache.
0       Dirty pages written by trickle-sync thread.
621     Current total page count.
621     Current clean page count.
0       Current dirty page count.
4099    Number of hash buckets used for page location.
2614195 Total number of times hash chains searched for a page.
2       The longest hash chain searched for a page.
2739893 Total number of hash buckets examined for page location.
5553628 The number of hash bucket locks granted without waiting.
1       The number of hash bucket locks granted after waiting.
1       The maximum number of times any hash bucket lock was waited for.
2452    The number of region locks granted without waiting.
0       The number of region locks granted after waiting.
686     The number of page allocations.
0       The number of hash buckets examined during allocations
0       The max number of hash buckets examined for an allocation
0       The number of pages examined during allocations
0       The max number of pages examined for an allocation
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: sambaPrimaryGroupSID.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
61459   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
7       Pages created in the cache.
2       Pages read into the cache.
26      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: sambaDomainName.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
56170   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
5       Pages created in the cache.
2       Pages read into the cache.
25      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: uid.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
519426  Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
100     Pages created in the cache.
2       Pages read into the cache.
174     Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: memberUid.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
48868   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
8       Pages created in the cache.
2       Pages read into the cache.
54      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: sambaSID.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
54742   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
7       Pages created in the cache.
2       Pages read into the cache.
27      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: displayName.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
288531  Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
63      Pages created in the cache.
2       Pages read into the cache.
94      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: gidNumber.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
64357   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
7       Pages created in the cache.
2       Pages read into the cache.
31      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: uidNumber.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
35590   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
8       Pages created in the cache.
2       Pages read into the cache.
26      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: sn.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
171256  Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
35      Pages created in the cache.
2       Pages read into the cache.
56      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: cn.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
711713  Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
125     Pages created in the cache.
2       Pages read into the cache.
209     Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: objectClass.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
519068  Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
35      Pages created in the cache.
2       Pages read into the cache.
93      Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: dn2id.bdb
4096    Page size.
0       Requested pages mapped into the process' address space.
52724   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
73      Pages created in the cache.
2       Pages read into the cache.
115     Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: id2entry.bdb
16384   Page size.
0       Requested pages mapped into the process' address space.
29644   Requested pages found in the cache (100%).
2       Requested pages not found in the cache.
122     Pages created in the cache.
2       Pages read into the cache.
253     Pages written from the cache to the backing file.
- --
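
The "(100%)" figures above are rounded, so the exact cache hit ratio
can be worked out from the two request counters. A minimal Python
sketch, assuming the text format shown above; the helper name is
hypothetical and it reads only the first pair of counters it finds,
which here is the pool-wide total:

```python
import re

# Counters as printed in the memory pool statistics above.
FOUND = re.compile(
    r"^(\d+)\s+Requested pages found in the cache", re.M
)
MISSED = re.compile(
    r"^(\d+)\s+Requested pages not found in the cache", re.M
)


def hit_ratio(stat_text: str) -> float:
    """Return hits / (hits + misses) from the first pair of counters."""
    found = FOUND.search(stat_text)
    missed = MISSED.search(stat_text)
    if found is None or missed is None:
        raise ValueError("memory pool counters not present in input")
    hits, misses = int(found.group(1)), int(missed.group(1))
    total = hits + misses
    return hits / total if total else 0.0


if __name__ == "__main__":
    sample = (
        "2613548 Requested pages found in the cache (100%).\n"
        "26      Requested pages not found in the cache.\n"
    )
    print(f"hit ratio: {hit_ratio(sample):.5%}")
```

With only 26 misses against roughly 2.6 million hits, the cache does
not appear to be undersized for this data set.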

I hope someone can help me out or at least point me in the right
direction. Please let me know if you need more info.

Speak to you soon and thanks.

Sincerely yours,
Jacob Lindberg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
 
iD8DBQFC7o7bLhpaQ+AxfdwRAmXtAKCoGGwJgihY6UZ+SvFo0En6/jLP6wCfcR4B
r3hJLFqUwSB2VeZwlon2eEM=
=aIMx
-----END PGP SIGNATURE-----