
calloc failures with concurrent writers (ITS#277)



Full_Name: Yuri Rabover
Version: 1.2.6 OPENLDAP_REL_ENG_1_2
OS: Solaris 2.6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (216.111.12.67)


I am testing OpenLDAP 1.2.6 with Sleepycat Berkeley DB 2.7.7 on Solaris 2.6,
on a 2-CPU SPARCstation 20.

Output of showrev -p:
Patch: 105401-16 Obsoletes: 105524-01 Requires:  Incompatibles:  Packages:
SUNWcsu, SUNWcsr, SUNWarc, SUNWnisu
Patch: 105181-09 Obsoletes: 105214-01, 105636-01, 105776-01, 106031-02,
106308-01 Requires:  Incompatibles:  Packages: SUNWcsu, SUNWcsr, SUNWcar,
SUNWhea
Patch: 105562-03 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
SUNWnisu
Patch: 105210-17 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
SUNWarc
Patch: 105216-03 Obsoletes:  Requires: 105401-07 Incompatibles:  Packages:
SUNWcsu
Patch: 105621-08 Obsoletes: 105686-02, 105845-01, 106064-01, 106075-01 Requires:
 Incompatibles:  Packages: SUNWcsu, SUNWcsr, SUNWarc, SUNWhea, SUNWnisu
Patch: 105393-07 Obsoletes: 106033-01 Requires: 105621-04 Incompatibles: 
Packages: SUNWcsu
Patch: 105395-03 Obsoletes: 105518-01, 105736-01 Requires:  Incompatibles: 
Packages: SUNWcsu, SUNWcsr, SUNWnisu
Patch: 105615-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 105665-03 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 106049-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 106257-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 106242-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtbas
Patch: 105800-03 Obsoletes:  Requires: 106125-05 Incompatibles:  Packages:
SUNWadmap
Patch: 106193-03 Obsoletes: 106350-01 Requires:  Incompatibles:  Packages:
SUNWadmap
Patch: 105558-03 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtdte,
SUNWdtdst
Patch: 105837-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtdte
Patch: 105566-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtdmn,
SUNWdtdst
Patch: 106222-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWoldst
Patch: 105375-09 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWluxal,
SUNWluxop
Patch: 105552-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWnisu
Patch: 106235-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWpcu,
SUNWpsu
Patch: 105357-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWses
Patch: 105356-07 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWssadv
Patch: 105926-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWsutl
Patch: 106125-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWswmt
Patch: 105407-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWvolu
Patch: 106040-10 Obsoletes: 105189-03 Requires:  Incompatibles:  Packages:
SUNWxi18n, SUNWxim


Patch: 106271-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
SUNWnisu
Patch: 105755-06 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 106301-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 106439-02 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 106448-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 105490-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu,
SUNWcsr, SUNWarc, SUNWbtool, SUNWhea, SUNWtoo, SUNWosdem, SUNWxcu4
Patch: 106226-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu
Patch: 105379-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
Patch: 105786-06 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
Patch: 105720-06 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
Patch: 105797-05 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsr
Patch: 105600-06 Obsoletes:  Requires: 105181-05 Incompatibles:  Packages:
SUNWcsr, SUNWhea
Patch: 105284-16 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWmfrun
Patch: 105464-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWxwopt,
SUNWxwman
Patch: 105669-04 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWdtbas

I am running two ldapadd commands from two separate windows:

ldapadd -h localhost -p 9009 -w secret -D "cn=Manager,dc=3Cube,dc=com" -r -c \
    </tmp/l100 >/dev/null

The only difference between the two commands is the input file name: in
the second case it is /tmp/l200.

Both files contain 1200 LDIF entries, which are added to an empty
database. The entries in the two files do not overlap, but they do belong
to a common subtree. Executed separately, the commands work fine. When
they are executed simultaneously, the server returns:

calloc of 710820164 elems of 4 bytes failed

The number of elements varies slightly, but if you translate it into
ASCII it looks like *ME or something similar, which resembles substrings
found in the LDIF files. It is possible that the stack is being
overwritten. Adding abort() after this debug message produces the
following stack trace:

#0  0xef593460 in __sigprocmask ()
#1  0xef58b02c in _resetsig ()
#2  0xef58a8f0 in _sigon ()
#3  0xef58d4fc in _thrp_kill ()
#4  0xef5fa4e8 in abort ()
#5  0x28354 in ch_calloc (nelem=710820164, size=4) at ch_malloc.c:59
#6  0x33e54 in idl_alloc (nids=710820162) at idl.c:23
#7  0x3594c in idl_dup (idl=0x8129a8) at idl.c:786
#8  0x33f90 in idl_fetch_one (be=0x0, db=0xa30e4, key={data = 0xef1432d0, 
      size = 5, ulen = 0, dlen = 0, doff = 0, flags = 0}) at idl.c:81
#9  0x3473c in idl_insert_key (be=0xa2b58, db=0xa30e4, key={data = 0xef1432d0, 
      size = 5, ulen = 0, dlen = 0, doff = 0, flags = 0}, id=127) at idl.c:336
#10 0x36610 in change_value (be=0xa2b58, db=0xa30e4, type=0x80f670 "dn", 
    indextype=42, val=0xef143b68 "^AB", id=127, 
    idl_func=0x346e4 <idl_insert_key>) at index.c:226
#11 0x36afc in index_change_values (be=0xa2b58, type=0x80f670 "dn", 
    vals=0xef143be0, id=127, op=1) at index.c:393
#12 0x36024 in index_add_entry (be=0xa2b58, e=0x80c328) at index.c:52
#13 0x2f93c in ldbm_back_add (be=0xa2b58, conn=0xaddf0, op=0x7f7cb8, 
    e=0x80c328) at add.c:191
#14 0x216cc in do_add (conn=0xaddf0, op=0x7f7cb8) at add.c:128
#15 0x1f288 in connection_operation (arg_v=0x80f950) at connection.c:51

On one occasion the server dumped core deep inside the Solaris malloc,
in t_delete. In my experience this usually indicates a race condition in
multi-threaded code, where two threads work against the same arena.
Eventually the arena becomes corrupted and t_delete dumps core.

The server was built using the following configure command:

env ac_cv_func_pthread_create=no ol_cv_kthread_flag=no \
    ol_cv_pthread_flag=no ol_cv_pthreads_flag=no \
    ol_cv_thread_flag=no \
    CPPFLAGS="-I/usr/local/BerkeleyDB/include" \
    LDFLAGS="-L/usr/local/BerkeleyDB/lib" \
   ./configure  --with-ldbm-api=db2

Using standard 1.2.6 with various thread options (like LIBS=-lpthread -lposix4)
and other combinations produces the same result. Building --without-threads
makes the issue disappear, but it serializes all access to the server,
which makes it unsuitable for our purposes.

Feel free to ask any questions; this issue is very important to us.

Regards,
             Yuri Rabover