
Re: (ITS#6200) slapd crashes under load w/ syncrepl



We build with:

   % CPPFLAGS="-I/usr/local/krb5/include -I/usr/local/include/sasl" \
     CFLAGS="-g" ./configure ...
   % make

Setting the 'CFLAGS' environment variable makes sure that '-g' is injected 
into all generated makefiles.  You can add -D_GNU_SOURCE there as well.
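
For reference, the debug build and the traceback capture could be scripted 
roughly as below.  This is only a sketch, not our actual build script: the 
names DEBUG_CFLAGS, debug_build and capture_backtrace are made up for 
illustration, and adding -O0 is my assumption (it turns off optimization, 
which is the usual reason gdb prints "<value optimized out>").

```shell
# Sketch only: debug build of OpenLDAP plus a batch-mode gdb backtrace.
# DEBUG_CFLAGS, debug_build and capture_backtrace are illustrative names.

# -g adds debug symbols; -O0 (an assumption, not part of our build) disables
# optimization so fewer values show up as "<value optimized out>".
DEBUG_CFLAGS="-g -O0 -D_GNU_SOURCE"

# Run from the top of the OpenLDAP source tree.
# Extra configure options can be passed as arguments.
debug_build() {
    CFLAGS="$DEBUG_CFLAGS" ./configure --enable-debug "$@" &&
    make depend &&
    make
}

# Print a full backtrace of every thread from a core file.
# $1 = path to the slapd binary, $2 = path to the core file.
capture_backtrace() {
    gdb -batch -ex "thread apply all bt full" "$1" "$2"
}
```

After sourcing the file, run debug_build in the source tree, reproduce the 
crash, and then call capture_backtrace with the slapd binary and the core 
file.  If frames still show up as "??", the installed binary may have been 
stripped; pointing gdb at the binary left in the build tree is one way to 
check.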

We experienced the problems you described with versions before 2.4.17; 
2.4.17 fixed many of the segfault causes.  We didn't repeat our stress test 
with multimaster 
and 2.4.17+ though, since we determined that deltasync was good enough for 
our purposes.  It may very well be that 2.4.17+ would crash in multimaster 
mode with our stress test.  The traceback you provided looks very similar 
to what we were seeing.

If I have time I could run a test with 2.4.18 and multimaster and see what 
the results are...


---
Tracy Stenvik
University Computing Services 354843.  University of Washington
email: imf@u.washington.edu  voice: (206) 685-3344

On Thu, 17 Sep 2009, Joacim Breiler wrote:

> We are trying out the 2.4.18 release with a syncrepl Multi-Master setup and 
> are experiencing similar problems as mentioned above.
>
> It will shut down with a segfault after heavy processing/updating of about 
> 3000-17000 entries. We can reproduce this in about 10-20 minutes of heavy 
> load. However, in this short execution time we can't find any signs of memory 
> leakage.
>
> Unfortunately I can't get any good tracebacks from gdb. I've tried compiling 
> openldap with:
> export CPPFLAGS="-D_GNU_SOURCE -g", and also with --enable-debug passed to 
> ./configure. Any pointers?
>
> This is the traceback I get, but I guess it isn't very helpful without the 
> complete debugging information:
>
> (gdb) where
> #0  0x00007fc7e24fb8eb in syncprov_op_mod (op=0x7fc7aedb37d0, rs=<value optimized out>) at syncprov.c:1970
> #1  0x000000000049531a in overlay_op_walk ()
> #2  0x0000000000495e7a in ?? ()
> #3  0x000000000044a2e2 in fe_op_modify ()
> #4  0x000000000044ac67 in do_modify ()
> #5  0x00000000004321bf in ?? ()
> #6  0x0000000000432e6c in ?? ()
> #7  0x00000000004dd960 in ?? ()
> #8  0x00007fc7e2e9b3ea in start_thread () from /lib/libpthread.so.0
> #9  0x00007fc7e29f3cbd in clone () from /lib/libc.so.6
> #10 0x0000000000000000 in ?? ()
> (gdb)
>
> Could these problems be related? And is there something we can do to help 
> resolve it?
>
> Regards,
> Joacim Breiler
>
>
>
> Configuration:
>
> dn: olcDatabase={1}hdb,cn=config
> objectClass: olcDatabaseConfig
> objectClass: olcHdbConfig
> olcDatabase: {1}hdb
> olcDbDirectory: /srv/ldap
> olcSuffix: dc=hgo,dc=se
> olcAccess: {0}to attrs=userPassword,shadowLastChange by dn="cn=admin,dc=hgo,
> dc=se" write by anonymous auth by self write by * none
> olcLastMod: TRUE
> olcRootDN: cn=admin,dc=hgo,dc=se
> olcRootPW: xxxxxxxx
> olcSizeLimit: -1
> olcSyncrepl: {0}rid=004 provider=ldap://diablo binddn="cn=admin,dc=hgo,dc=se
> " bindmethod=simple credentials="xxxxxxx" searchbase="dc=hgo,dc=se" ty
> pe=refreshAndPersist retry="5 5 300 5" timeout=1
> olcSyncrepl: {1}rid=003 provider=ldap://mephisto binddn="cn=admin,dc=hgo,dc=
> se" bindmethod=simple credentials="xxxxxxx" searchbase="dc=hgo,dc=se"
> type=refreshAndPersist retry="5 5 300 5" timeout=1
> olcMirrorMode: TRUE
> olcDbCheckpoint: 2048 5
> olcDbIDLcacheSize: 90000
> olcDbIndex: objectClass,entryCSN,entryUUID,cn eq
> olcDbCacheSize: 10000
> olcDbConfig: {0}set_cachesize 1 0 0
> olcDbConfig: {1}set_lk_max_objects 32000
> olcDbConfig: {2}set_lk_max_locks 32000
> olcDbConfig: {3}set_lk_max_lockers 1500
> olcDbConfig: {4}set_flags DB_LOG_AUTOREMOVE
> olcDbConfig: {5}set_lg_bsize 104857600
> olcDbConfig: {6}set_lg_dir /srv/ldap
>
>
>