
Re: (ITS#6853) slapadd/slapindex -q hang



--On Friday, March 04, 2011 9:08 PM +0000 hyc@symas.com wrote:

> dhawes@vt.edu wrote:
>> On 03/03/2011 02:38 PM, Quanah Gibson-Mount wrote:
>>> --On Thursday, March 03, 2011 7:34 PM +0000 dhawes@vt.edu wrote:
>>>
>>>> Full_Name: David Hawes
>>>> Version: 2.4.24
>>>> OS: Ubuntu 10.04
>>>> URL:
>>>> Submission from: (NULL) (128.173.39.26)
>>>>
>>>>
>>>> When using slapadd or slapindex with the -q option, the message
>>>> "Closing DB..." is printed and then the application hangs
>>>> indefinitely. Removing the -q option allows the application to
>>>> complete without issue.
>>>>
>>>> This occurs with Berkeley DB 4.7.25 (with patches) and 5.1.25.
>>>
>>> I would ask that you provide a full backtrace of the slapadd process
>>> after it has hung. Otherwise, this report isn't of much use.
>>>
>>> Also, if you are using the Ubuntu patches for OpenLDAP with your
>>> OpenLDAP build, you are including a known database-corrupting patch.
>>> Since you don't say how you built OpenLDAP, it is impossible for us to
>>> know if you did this or not.
>>
>> Both OpenLDAP and Berkeley DB are compiled from source.  No Ubuntu
>> packages or code are used.
>>
>> Backtraces (I may need to recompile without optimization):
>>
>> (gdb) thread apply all bt
>>
>> Thread 2 (Thread 0x7ffee9003700 (LWP 29225)):
>> #0  0x00007ffff763c85c in pthread_cond_wait@@GLIBC_2.3.2 ()
>>      from /lib/libpthread.so.0
>> #1  0x00000000004b4150 in bdb_tool_trickle_task (ctx=<value optimized out>,
>>       ptr=<value optimized out>) at tools.c:1253
>> #2  0x00000000005066b0 in ldap_int_thread_pool_wrapper (
>>       xpool=<value optimized out>) at tpool.c:685
>> #3  0x00007ffff76379ca in start_thread () from /lib/libpthread.so.0
>> #4  0x00007ffff677970d in clone () from /lib/libc.so.6
>> #5  0x0000000000000000 in ?? ()
>
> This indicates that the trickle task is still waiting for a signal on its
> condition variable, which is a bit odd, since bdb_tool_entry_close()
> already signals it before slap_tool_destroy() is called.
>
> It might be illuminating to run slapadd under gdb with a breakpoint on
> bdb_tool_entry_close(), single-step through the first few lines of that
> function, where it issues the signal, and see whether the trickle task
> actually reacts.
>
>> Thread 1 (Thread 0x7ffff7fd9700 (LWP 29220)):
>> #0  0x00007ffff763c85c in pthread_cond_wait@@GLIBC_2.3.2 ()
>>      from /lib/libpthread.so.0
>> #1  0x0000000000506223 in ldap_pvt_thread_pool_destroy (tpool=0x7ffffffed658,
>>       run_pending=<value optimized out>) at tpool.c:582
>> #2  0x0000000000506a0a in ldap_int_thread_pool_shutdown () at tpool.c:181
>> #3  0x00000000005050a9 in ldap_pvt_thread_destroy () at threads.c:70
>> #4  0x0000000000466059 in slap_destroy () at init.c:273
>> #5  0x00000000004a5ade in slap_tool_destroy () at slapcommon.c:932
>> #6  0x00000000004a46e7 in slapadd (argc=0, argv=<value optimized out>)
>>       at slapadd.c:606
>> #7  0x000000000041edc0 in main (argc=4, argv=0x7fffffffe048) at main.c:407

We are seeing numerous reports of this hang in Zimbra deployments running
OpenLDAP 2.4.23 plus the multi-core fix (ITS#6660).
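
One pattern that would produce the Thread 2 backtrace above is a lost
wakeup: if the signal is sent before the task has reached
pthread_cond_wait(), and the wait is not guarded by a predicate, the task
sleeps forever and the pool destroy in Thread 1 never finishes. That is
only a guess at the mechanism; nothing in this thread confirms it. A
minimal pthreads sketch of the predicate-guarded wait that avoids the
problem is below. The names struct worker, worker_main and
shutdown_before_wait_demo are hypothetical and are not taken from
back-bdb's tools.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a background task's shared state.
 * These names are illustrative only; they are not OpenLDAP code. */
struct worker {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    bool            shutdown;  /* predicate, only touched while holding mu */
    int             exits;     /* counts clean exits from the wait loop */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    pthread_mutex_lock(&w->mu);
    /* Check the predicate before and after every wait. If shutdown was
     * requested before this thread got here, the loop body never runs,
     * so a signal sent "too early" cannot be lost. */
    while (!w->shutdown)
        pthread_cond_wait(&w->cv, &w->mu);
    w->exits++;
    pthread_mutex_unlock(&w->mu);
    return NULL;
}

/* Requests shutdown before the worker thread even exists, which is the
 * worst case for an unguarded pthread_cond_wait(). Returns 0 if the
 * worker still terminated and was joined, -1 on a pthread error. */
int shutdown_before_wait_demo(void)
{
    struct worker w;
    pthread_t tid;

    if (pthread_mutex_init(&w.mu, NULL) != 0)
        return -1;
    if (pthread_cond_init(&w.cv, NULL) != 0) {
        pthread_mutex_destroy(&w.mu);
        return -1;
    }
    w.shutdown = false;
    w.exits = 0;

    pthread_mutex_lock(&w.mu);
    w.shutdown = true;            /* set the predicate first ... */
    pthread_cond_signal(&w.cv);   /* ... then signal; nobody is waiting yet */
    pthread_mutex_unlock(&w.mu);

    if (pthread_create(&tid, NULL, worker_main, &w) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    assert(w.exits == 1);         /* the worker left its loop exactly once */
    pthread_cond_destroy(&w.cv);
    pthread_mutex_destroy(&w.mu);
    return 0;
}
```

Called from a small main() and built with -pthread, the function should
return 0 instead of hanging. Replacing the while loop with a single
unconditional pthread_cond_wait() would make the same call block in the
join, which is the shape of hang reported here.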

--Quanah



--

Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra ::  the leader in open source messaging and collaboration