
Re: calloc failures with concurrent writers (ITS#277)



yurir@3Cube.com wrote:

> Full_Name: Yuri Rabover
> Version: 1.2.6 OPENLDAP_REL_ENG_1_2
> OS: Solaris 2.6
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (216.111.12.67)
>
> I am testing OpenLDAP 1.2.6 with Sleepycat 2.7.7 on Solaris 2.6 on
> a 2-CPU SPARCstation 20.
>
> showrev -p
> Patch: 105401-16 Obsoletes: 105524-01 Requires:  Incompatibles:  Packages:
> SUNWcsu, SUNWcsr, SUNWarc, SUNWnisu
> Patch: 105181-09 Obsoletes: 105214-01, 105636-01, 105776-01, 106031-02,
> 106308-01 Requires:  Incompatibles:  Packages: SUNWcsu, SUNWcsr, SUNWcar,
> SUNWhea
> Patch: 105562-03 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
> SUNWnisu
> Patch: 105210-17 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
> SUNWarc
> Patch: 105216-03 Obsoletes:  Requires: 105401-07 Incompatibles:  Packages:
> SUNWcsu
> Patch: 105621-08 Obsoletes: 105686-02, 105845-01, 106064-01, 106075-01 Requires:
>  Incompatibles:  Packages: SUNWcsu, SUNWcsr, SUNWarc, SUNWhea, SUNWnisu
> Patch: 105393-07 Obsoletes: 106033-01 Requires: 105621-04 Incompatibles:
> Packages: SUNWcsu
> Patch: 105395-03 Obsoletes: 105518-01, 105736-01 Requires:  Incompatibles:
> Packages: SUNWcsu, SUNWcsr, SUNWnisu
> Patch: 105615-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 105665-03 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 106049-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 106257-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 106242-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtbas
> Patch: 105800-03 Obsoletes:  Requires: 106125-05 Incompatibles:  Packages:
> SUNWadmap
> Patch: 106193-03 Obsoletes: 106350-01 Requires:  Incompatibles:  Packages:
> SUNWadmap
> Patch: 105558-03 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtdte,
> SUNWdtdst
> Patch: 105837-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtdte
> Patch: 105566-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtdmn,
> SUNWdtdst
> Patch: 106222-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWoldst
> Patch: 105375-09 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWluxal,
> SUNWluxop
> Patch: 105552-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWnisu
> Patch: 106235-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWpcu,
> SUNWpsu
> Patch: 105357-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWses
> Patch: 105356-07 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWssadv
> Patch: 105926-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWsutl
> Patch: 106125-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWswmt
> Patch: 105407-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWvolu
> Patch: 106040-10 Obsoletes: 105189-03 Requires:  Incompatibles:  Packages:
> SUNWxi18n, SUNWxim
>
> Patch: 106271-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
> SUNWnisu
> Patch: 105755-06 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 106301-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 106439-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 106448-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 105490-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
> SUNWcsr, SUNWarc, SUNWbtool, SUNWhea, SUNWtoo, SUNWosdem, SUNWxcu4
> Patch: 106226-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
> Patch: 105379-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
> Patch: 105786-06 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
> Patch: 105720-06 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
> Patch: 105797-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
> Patch: 105600-06 Obsoletes:  Requires: 105181-05 Incompatibles:  Packages:
> SUNWcsr, SUNWhea
> Patch: 105284-16 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWmfrun
> Patch: 105464-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWxwopt,
> SUNWxwman
> Patch: 105669-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtbas
>
> I am running 2 ldapadd commands from 2 separate windows:
>
> ldapadd -h localhost -p 9009  -w secret -D "cn=Manager,dc=3Cube,dc=com" -r -c
> </tmp/l100 >/dev/null
>
> The only difference between the 2 commands is the file name: in the
> second case it is /tmp/l200.
>
> Both files contain 1200 LDIF entries, which are added to the empty
> database. The entries in the two files do not overlap, but they do belong
> to a common subtree. Each command works fine when executed separately.
> When they are executed simultaneously, the server returns:
>
> calloc of 710820164 elems of 4 bytes failed
>
> The number of elements varies slightly, but if you translate it into
> ASCII you will find that it looks like *ME or something like that,
> which resembles substrings found in the LDIF files. There is a chance
> that the stack is being overwritten. Adding abort() after this debug
> message produces the following stack trace:
>
> #0  0xef593460 in __sigprocmask ()
> #1  0xef58b02c in _resetsig ()
> #2  0xef58a8f0 in _sigon ()
> #3  0xef58d4fc in _thrp_kill ()
> #4  0xef5fa4e8 in abort ()
> #5  0x28354 in ch_calloc (nelem=710820164, size=4) at ch_malloc.c:59
> #6  0x33e54 in idl_alloc (nids=710820162) at idl.c:23
> #7  0x3594c in idl_dup (idl=0x8129a8) at idl.c:786
> #8  0x33f90 in idl_fetch_one (be=0x0, db=0xa30e4, key={data = 0xef1432d0,
>       size = 5, ulen = 0, dlen = 0, doff = 0, flags = 0}) at idl.c:81
> #9  0x3473c in idl_insert_key (be=0xa2b58, db=0xa30e4, key={data = 0xef1432d0,
>       size = 5, ulen = 0, dlen = 0, doff = 0, flags = 0}, id=127) at idl.c:336
> #10 0x36610 in change_value (be=0xa2b58, db=0xa30e4, type=0x80f670 "dn",
>     indextype=42, val=0xef143b68 "^AB", id=127,
>     idl_func=0x346e4 <idl_insert_key>) at index.c:226
> #11 0x36afc in index_change_values (be=0xa2b58, type=0x80f670 "dn",
>     vals=0xef143be0, id=127, op=1) at index.c:393
> #12 0x36024 in index_add_entry (be=0xa2b58, e=0x80c328) at index.c:52
> #13 0x2f93c in ldbm_back_add (be=0xa2b58, conn=0xaddf0, op=0x7f7cb8,
>     e=0x80c328) at add.c:191
> #14 0x216cc in do_add (conn=0xaddf0, op=0x7f7cb8) at add.c:128
> #15 0x1f288 in connection_operation (arg_v=0x80f950) at connection.c:51
>
> Once, the server dumped core deep inside the Solaris malloc, in
> t_delete. In my experience this usually indicates a race condition in
> multi-threaded code, where 2 threads work against the same arena.
> Eventually, the arena becomes corrupted and t_delete dumps core.
>
> The server was built using the following configure command:
>
> env ac_cv_func_pthread_create=no ol_cv_kthread_flag=no \
>     ol_cv_pthread_flag=no ol_cv_pthreads_flag=no \
>     ol_cv_thread_flag=no \
>     CPPFLAGS="-I/usr/local/BerkeleyDB/include" \
>     LDFLAGS="-L/usr/local/BerkeleyDB/lib" \
>    ./configure  --with-ldbm-api=db2
>
> Using standard 1.2.6 with various thread options (like LIBS=-lpthread -lposix4)
> and other combinations produces the same result. Building --without-threads
> causes this issue to disappear, but it serializes all access to the server,
> which makes it unsuitable for our tasks.
>
> Feel free to ask any questions, this issue is very important to us.
>
> Regards,
>              Yuri Rabover

Yuri,

There is a multi-threaded malloc library for Solaris (at least in 2.7).
Try adding -lmtmalloc to your LIBS definition.  We are also working
with OPENLDAP_REL_ENG_1_2 on a multi-processor Solaris box and were
seeing some problems in the memory allocation calls.  I don't think we
have seen them since we started using the multi-threaded malloc library.
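
On the ASCII observation in the report: the counts in the trace can be
decoded directly. The sketch below is only an illustration; the helper
name as_ascii is made up, and it assumes the count is a 32-bit value read
in big-endian byte order (which is how SPARC stores it).

```python
def as_ascii(value: int, width: int = 4) -> str:
    """Render an integer's big-endian bytes as printable text.

    Hypothetical helper for illustration; width=4 assumes a 32-bit count.
    """
    raw = value.to_bytes(width, byteorder="big")
    # Show non-printable bytes as '.' so the result is always readable.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


nelem = 710820164  # from the calloc message and frame #5 (ch_calloc)
nids = 710820162   # from frame #6 (idl_alloc)

print(hex(nids), as_ascii(nids))    # 0x2a5e4142 *^AB
print(hex(nelem), as_ascii(nelem))  # 0x2a5e4144 *^AD
```

The decoded nids value, "*^AB", lines up with frame #10 of the same
trace, where indextype=42 (the ASCII code for '*') and val is "^AB".
That would be consistent with the ID list's count field having been
overwritten by index key text, though the trace alone does not prove it.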

-Jeff Romine