
Re: SLAPD hangs on existing connections, but accepts new one



You are likely attempting to use a preemptive threading system
with slapd.  As noted in the FAQ and elsewhere, 1.2 supports
only non-preemptive threading systems.  2.0 removes this
restriction.

I note that the problem might be something else altogether.
I'm just pointing out the obvious.

Kurt

At 11:46 PM 4/23/01, Marion.Thyen@t-mobil.de wrote:
>Hi,
>
>at some point slapd stops processing new requests on existing 
>connections. I am using version 1.2.11 on Solaris 2.7.
>
>The scenario was the following: Three clients are searching for 
>entries and one client is modifying entries. All clients work on 
>the same 900 objects in parallel. Each client sends a new request 
>as soon as it receives the answer to its previous request. This is 
>meant to be a simple stress test.
>After about 40,000,000 requests (in total) I restarted all my clients. 
>After another 15,000,000 requests (in total) all my clients hung, 
>waiting for slapd.
>
>The monitoring facility outputs the lines:
>        CN=MONITOR
>        version=slapd 1.2.11-Release (Thu Apr  5 09:14:27 MET DST 2001)
>        threads=5
>        connection=5 : 20010412075524Z : 101 : 101 : cn=Manager, dc=wapldap,
>dc=detemobil, dc=de :
>        connection=13 : 20010420072202Z : 3790771 : 3790771 : cn=Manager,
>dc=wapldap, dc=detemobil, dc=de :
>        connection=16 : 20010420072202Z : 3981891 : 3981891 : cn=Manager,
>dc=wapldap, dc=detemobil, dc=de :
>        connection=17 : 20010420072202Z : 3968004 : 3968004 : cn=Manager,
>dc=wapldap, dc=detemobil, dc=de :
>        connection=18 : 20010423074721Z : 1 : 0 : NULLDN :
>        connection=19 : 20010420072202Z : 3971366 : 3971366 : cn=Manager,
>dc=wapldap, dc=detemobil, dc=de :
>        currentconnections=6
>        totalconnections=39
>        dtablesize=1024
>        writewaiters=0
>        readwaiters=0
>        opsinitiated=54672536
>        opscompleted=54672535
>        entriessent=41531895
>        bytessent=1823511052
>        currenttime=20010423074721Z
>        starttime=20010412075522Z
>        nbackends=1
>
>My slapd process does not look very healthy: it has lost the connection 
>to almost all of its shared libraries. I quote the output of "lsof":
>
>./lsof_4.55 -p 17944
>COMMAND   PID USER   FD   TYPE        DEVICE   SIZE/OFF     NODE NAME
>slapd   17944 root  cwd   VDIR         155,0       1536        2 /
>slapd   17944 root  txt   VREG         155,0      36316   155378
>/usr/lib/libpthread.so.1
>slapd   17944 root    0u  VCHR          13,2      0t260   788442
>/devices/pseudo/mm@0:null
>slapd   17944 root    1u  VCHR          13,2      0t260   788442
>/devices/pseudo/mm@0:null
>slapd   17944 root    2u  VCHR          13,2      0t260   788442
>/devices/pseudo/mm@0:null
>slapd   17944 root    3u  inet 0x30005b332e8        0t0      TCP *:ldap
>(LISTEN)
>slapd   17944 root    4w  VCHR          21,0        0t0   788438
>/devices/pseudo/log@0:conslog->LOG
>slapd   17944 root    5u  inet 0x30005b32488    0t11426      TCP
>dxcsx6:ldap->dxcsx6:44948 (ESTABLISHED)
>slapd   17944 root    6uW VREG           0,1    1023992 11512841 /tmp (swap)
>slapd   17944 root    7uW VREG           0,1  131449038 12598323 /tmp (swap)
>slapd   17944 root    8r  DOOR         196,0        0t0     7884
>/etc/.name_service_door (door to nscd[370])
>slapd   17944 root    9uW VREG           0,1    3043324 10922789 /tmp (swap)
>slapd   17944 root   10uW VREG           0,1    1585165 12598843 /tmp (swap)
>slapd   17944 root   11uW VREG           0,1    1593357 12598563 /tmp (swap)
>slapd   17944 root   12uW VREG           0,1     122895 12599123 /tmp (swap)
>slapd   17944 root   13u  inet 0x30002b9f9d8 0x22ba0a58      TCP
>dxcsx6:ldap->dxcsx3:63563 (ESTABLISHED)
>slapd   17944 root   14uW VREG           0,1      81928 10922909 /tmp (swap)
>slapd   17944 root   15uW VREG           0,1    2806042 11660500 /tmp (swap)
>slapd   17944 root   16u  inet 0x30005fc6d78 0x525d03d6      TCP
>dxcsx6:ldap->dxcsx3:63564 (ESTABLISHED)
>slapd   17944 root   17u  inet 0x30005b32ac8 0x52137b2c      TCP
>dxcsx6:ldap->dxcsx3:
>slapd   17944 root   19u  inet 0x30006fb10a0 0x522543f3      TCP
>dxcsx6:ldap->dxcsx3:63565 (ESTABLISHED)
>
>A healthy slapd has additional file descriptors to:
>/usr/local/openldap-1.2.11-debug/libexec/slapd
>/usr/lib/libthread.so.1
>/usr/platform/sun4u/lib/libc_psr.so.1
>/usr/lib/libaio.so.1
>/usr/lib/libc.so.1
>/usr/lib/libmp.so.2
>/usr/lib/libnsl.so.1
>/usr/lib/librt.so.1
>/usr/lib/libsocket.so.1
>/usr/lib/libgen.so.1
>/usr/lib/libresolv.so.2
>/usr/local/lib/libgdbm.so.2.0.0
>/usr/lib/libdl.so.1
>/usr/lib/ld.so.1
>
>In its current state slapd accepts new connections and processes 
>requests on them as expected. But my clients just hang on the existing 
>connections. Although this error is not easy to reproduce, we have 
>observed it more than twice on different machines. We also saw the 
>bug with openldap-1.2.7.
>
>Has anybody had the same problem? Can anybody give me advice on how 
>to fix this bug or work around it? Should I migrate to version 2.0.7?
>
>Thanks,
>
>Marion
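
The monitor output quoted above reports opsinitiated=54672536 and opscompleted=54672535. A small script can make that comparison repeatable when the same dump is taken several times during a hang. The following is only a minimal sketch: the function names parse_monitor and pending_operations are made up for illustration, and it assumes the dump is plain "key=value" text as quoted, optionally with ">" quote prefixes. A difference of 1 may simply be the monitor search itself, which presumably has been initiated but not yet completed when the counters are printed.

```python
import re

# A few counter lines copied from the quoted monitor output.
SAMPLE = """\
>        threads=5
>        currentconnections=6
>        totalconnections=39
>        writewaiters=0
>        readwaiters=0
>        opsinitiated=54672536
>        opscompleted=54672535
"""

def parse_monitor(text):
    """Collect the purely numeric key=value lines of a monitor dump.

    Lines whose value is not a plain integer (version, connection,
    timestamps, wrapped DN continuations) are skipped.
    """
    counters = {}
    for raw in text.splitlines():
        line = raw.lstrip("> \t").rstrip()   # drop mail quoting and indent
        m = re.match(r"^([A-Za-z]+)=(\d+)$", line)
        if m:
            counters[m.group(1).lower()] = int(m.group(2))
    return counters

def pending_operations(counters):
    """Operations initiated but not yet reported as completed."""
    return counters["opsinitiated"] - counters["opscompleted"]

counters = parse_monitor(SAMPLE)
print("pending operations:", pending_operations(counters))
print("read/write waiters:", counters["readwaiters"], counters["writewaiters"])
```

If the pending count stays at 1 across repeated dumps while clients are blocked, the counters alone do not show queued work inside slapd, which fits the suggestion that the hang lies in thread scheduling rather than in a backlog of operations. That reading is an inference from the numbers, not something the monitor output states.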