[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#2610)



Jong,

(sorry for this huge post)

I've tried OpenLDAP HEAD, including Howard's patch, but I don't see any 
improvement when running test008.  The test was executed on an Athlon 1GHz 
with a 40GB UDMA IDE disk running under SuSE Linux 8.2, glibc 2.3.2. My guess 
is that the problem gets worse on slower hardware.

Here is the output of db_stat -c on the test-db:

*********************
62      Last allocated locker ID.
2147M   Current maximum unused locker ID.
9       Number of lock modes.
1000    Maximum number of locks possible.
1000    Maximum number of lockers possible.
1000    Maximum number of lock objects possible.
20      Number of current locks.
47      Maximum number of locks at any one time.
51      Number of current lockers.
61      Maximum number of lockers at any one time.
17      Number of current lock objects.
30      Maximum number of lock objects at any one time.
105092  Total number of locks requested.
104755  Total number of locks released.
0       Total number of lock requests failing because DB_LOCK_NOWAIT was set.
741     Total number of locks not immediately available due to conflicts.
235     Number of deadlocks.
0       Lock timeout value.
0       Number of locks that have timed out.
0       Transaction timeout value.
0       Number of transactions that have timed out.
360KB   The size of the lock region..
0       The number of region locks granted after waiting.
144735  The number of region locks granted without waiting.
*********************

Here is the output of GDB attached to slapd while running the test:

*********************
CC365526-A:/var/tmp/openldap-2.1.22 # gdb /usr/local/openldap/libexec/slapd 
29374
GNU gdb 5.3
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...
Attaching to program: /usr/local/openldap/libexec/slapd, process 29374
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libpthread.so.0...done.
[New Thread 16384 (LWP 29340)]
[New Thread 32769 (LWP 29343)]
[New Thread 16386 (LWP 29344)]
[New Thread 32771 (LWP 29346)]
[New Thread 49156 (LWP 29366)]
[New Thread 65541 (LWP 29367)]
[New Thread 81926 (LWP 29368)]
[New Thread 98311 (LWP 29369)]
[New Thread 114696 (LWP 29370)]
[New Thread 131081 (LWP 29371)]
[New Thread 147466 (LWP 29372)]
[New Thread 163851 (LWP 29373)]
[New Thread 180236 (LWP 29374)]
[New Thread 196621 (LWP 29375)]
[New Thread 213006 (LWP 29376)]
[New Thread 229391 (LWP 29377)]
[New Thread 245776 (LWP 29378)]
[New Thread 262161 (LWP 29379)]
[New Thread 278546 (LWP 29380)]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
0x40153f57 in sched_yield () from /lib/libc.so.6
(gdb) thread apply all bt

Thread 19 (Thread 278546 (LWP 29380)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 18 (Thread 262161 (LWP 29379)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 17 (Thread 245776 (LWP 29378)):
#0  0x40153f57 in sched_yield () from /lib/libc.so.6
#1  0x0813c901 in __os_yield ()
#2  0x08105743 in __db_tas_mutex_lock ()
#3  0x080e68d7 in __lock_get_internal ()
#4  0x080e5c21 in __lock_vec ()
#5  0x08123a58 in __db_lget ()
#6  0x08109c1b in __bam_search ()
#7  0x0814195e in __bam_c_search ()
#8  0x0813ea44 in __bam_c_get ()
#9  0x0811c58e in __db_c_get ()
#10 0x08116ed8 in __db_get ()
#11 0x0809bcd0 in bdb_dn2id (op=0x81d73b0, txn=0x81e8608, dn=0x473ef4f8, 
ei=0x473ef4ec) at dn2id.c:326
#12 0x080989a1 in bdb_cache_find_ndn (op=0x81d73b0, txn=0x81e8608, 
ndn=0x473ef5c4, res=0x473ef5b8, locker=2147484671) at cache.c:362
#13 0x0808c7e0 in bdb_modrdn (op=0x81d73b0, rs=0x473ef8cc) at modrdn.c:692
#14 0x08066a99 in do_modrdn (op=0x81d73b0, rs=0x473ef8cc) at modrdn.c:366
#15 0x08056355 in connection_operation (ctx=0x473ef93c, arg_v=0x81d73b0) at 
connection.c:963
#16 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#17 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 16 (Thread 229391 (LWP 29377)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 15 (Thread 213006 (LWP 29376)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 14 (Thread 196621 (LWP 29375)):
#0  0x0813c8f0 in __os_yield ()
#1  0x080e68d7 in __lock_get_internal ()
#2  0x080e5c21 in __lock_vec ()
#3  0x08123a58 in __db_lget ()
#4  0x08109c1b in __bam_search ()
#5  0x0814195e in __bam_c_search ()
#6  0x0813ea44 in __bam_c_get ()
#7  0x0811c58e in __db_c_get ()
#8  0x08116ed8 in __db_get ()
#9  0x0809bcd0 in bdb_dn2id (op=0x81d8c10, txn=0x81d8e00, dn=0x464e9468, 
ei=0x464e945c) at dn2id.c:326
#10 0x080989a1 in bdb_cache_find_ndn (op=0x81d8c10, txn=0x81d8e00, 
ndn=0x81dd0fc, res=0x464e94e4, locker=2147484666) at cache.c:362
#11 0x0809b1cf in bdb_dn2entry (op=0x81d8c10, tid=0x81d8e00, dn=0x81d8c10, 
e=0x464e9548, matched=1, locker=2147484666, lock=0x464e95dc)
    at dn2entry.c:44
#12 0x080966d6 in bdb_add (op=0x81d8c10, rs=0x464e98cc) at add.c:144
#13 0x0805b0e3 in do_add (op=0x81d8c10, rs=0x464e98cc) at add.c:303
#14 0x08056325 in connection_operation (ctx=0x464e993c, arg_v=0x81d8c10) at 
connection.c:953
#15 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#16 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 13 (Thread 180236 (LWP 29374)):
#0  0x40153f57 in sched_yield () from /lib/libc.so.6
#1  0x0813c901 in __os_yield ()
#2  0x08105743 in __db_tas_mutex_lock ()
#3  0x080e68d7 in __lock_get_internal ()
#4  0x080e5c21 in __lock_vec ()
#5  0x08123a58 in __db_lget ()
#6  0x08109c1b in __bam_search ()
#7  0x0814195e in __bam_c_search ()
#8  0x0813ea44 in __bam_c_get ()
#9  0x0811c58e in __db_c_get ()
#10 0x08116ed8 in __db_get ()
#11 0x0809bcd0 in bdb_dn2id (op=0x81d3778, txn=0x81e8598, dn=0x45fe74f8, 
ei=0x45fe74ec) at dn2id.c:326
#12 0x080989a1 in bdb_cache_find_ndn (op=0x81d3778, txn=0x81e8598, 
ndn=0x45fe75c4, res=0x45fe75b8, locker=2147484670) at cache.c:362
#13 0x0808c7e0 in bdb_modrdn (op=0x81d3778, rs=0x45fe78cc) at modrdn.c:692
#14 0x08066a99 in do_modrdn (op=0x81d3778, rs=0x45fe78cc) at modrdn.c:366
#15 0x08056355 in connection_operation (ctx=0x45fe793c, arg_v=0x81d3778) at 
connection.c:963
#16 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#17 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 12 (Thread 163851 (LWP 29373)):
#0  0x40153f57 in sched_yield () from /lib/libc.so.6
#1  0x0813c901 in __os_yield ()
#2  0x08105743 in __db_tas_mutex_lock ()
#3  0x080e68d7 in __lock_get_internal ()
#4  0x080e5c21 in __lock_vec ()
#5  0x08123a58 in __db_lget ()
#6  0x08109c1b in __bam_search ()
#7  0x0814195e in __bam_c_search ()
#8  0x0813ea44 in __bam_c_get ()
#9  0x0811c58e in __db_c_get ()
#10 0x08116ed8 in __db_get ()
#11 0x0809bcd0 in bdb_dn2id (op=0x81d79b0, txn=0x81e6af8, dn=0x45ae5468, 
ei=0x45ae545c) at dn2id.c:326
#12 0x080989a1 in bdb_cache_find_ndn (op=0x81d79b0, txn=0x81e6af8, 
ndn=0x81d8074, res=0x45ae54e4, locker=2147484668) at cache.c:362
#13 0x0809b1cf in bdb_dn2entry (op=0x81d79b0, tid=0x81e6af8, dn=0x81d79b0, 
e=0x45ae5548, matched=1, locker=2147484668, lock=0x45ae55dc)
    at dn2entry.c:44
#14 0x080966d6 in bdb_add (op=0x81d79b0, rs=0x45ae58cc) at add.c:144
#15 0x0805b0e3 in do_add (op=0x81d79b0, rs=0x45ae58cc) at add.c:303
#16 0x08056325 in connection_operation (ctx=0x45ae593c, arg_v=0x81d79b0) at 
connection.c:953
#17 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#18 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 11 (Thread 147466 (LWP 29372)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 10 (Thread 131081 (LWP 29371)):
#0  0x40153f57 in sched_yield () from /lib/libc.so.6
#1  0x0813c901 in __os_yield ()
#2  0x08105743 in __db_tas_mutex_lock ()
#3  0x080e68d7 in __lock_get_internal ()
#4  0x080e5c21 in __lock_vec ()
#5  0x08123a58 in __db_lget ()
#6  0x08109c1b in __bam_search ()
#7  0x0814195e in __bam_c_search ()
#8  0x0813ea44 in __bam_c_get ()
#9  0x0811c58e in __db_c_get ()
#10 0x08116ed8 in __db_get ()
#11 0x0809bcd0 in bdb_dn2id (op=0x81d9838, txn=0x81e6b68, dn=0x450e1468, 
ei=0x450e145c) at dn2id.c:326
#12 0x080989a1 in bdb_cache_find_ndn (op=0x81d9838, txn=0x81e6b68, 
ndn=0x81e4bec, res=0x450e14e4, locker=2147484669) at cache.c:362
#13 0x0809b1cf in bdb_dn2entry (op=0x81d9838, tid=0x81e6b68, dn=0x81d9838, 
e=0x450e1548, matched=1, locker=2147484669, lock=0x450e15dc)
    at dn2entry.c:44
#14 0x080966d6 in bdb_add (op=0x81d9838, rs=0x450e18cc) at add.c:144
#15 0x0805b0e3 in do_add (op=0x81d9838, rs=0x450e18cc) at add.c:303
#16 0x08056325 in connection_operation (ctx=0x450e193c, arg_v=0x81d9838) at 
connection.c:953
#17 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#18 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 9 (Thread 114696 (LWP 29370)):
#0  0x40153f57 in sched_yield () from /lib/libc.so.6
#1  0x0813c901 in __os_yield ()
#2  0x08105743 in __db_tas_mutex_lock ()
#3  0x080e68d7 in __lock_get_internal ()
#4  0x080e5c21 in __lock_vec ()
#5  0x08123a58 in __db_lget ()
#6  0x08109c1b in __bam_search ()
#7  0x0814195e in __bam_c_search ()
#8  0x0813ea44 in __bam_c_get ()
#9  0x0811c58e in __db_c_get ()
#10 0x08116ed8 in __db_get ()
#11 0x0809bcd0 in bdb_dn2id (op=0x81d7178, txn=0x81d86e8, dn=0x44bdf4f8, 
ei=0x44bdf4ec) at dn2id.c:326
#12 0x080989a1 in bdb_cache_find_ndn (op=0x81d7178, txn=0x81d86e8, 
ndn=0x44bdf5c4, res=0x44bdf5b8, locker=2147484667) at cache.c:362
#13 0x0808c7e0 in bdb_modrdn (op=0x81d7178, rs=0x44bdf8cc) at modrdn.c:692
#14 0x08066a99 in do_modrdn (op=0x81d7178, rs=0x44bdf8cc) at modrdn.c:366
#15 0x08056355 in connection_operation (ctx=0x44bdf93c, arg_v=0x81d7178) at 
connection.c:963
#16 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#17 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 8 (Thread 98311 (LWP 29369)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 7 (Thread 81926 (LWP 29368)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 6 (Thread 65541 (LWP 29367)):
#0  0x40153f57 in sched_yield () from /lib/libc.so.6
#1  0x0813c901 in __os_yield ()
#2  0x08105743 in __db_tas_mutex_lock ()
#3  0x080e68d7 in __lock_get_internal ()
#4  0x080e5c21 in __lock_vec ()
#5  0x08123a58 in __db_lget ()
#6  0x08109c1b in __bam_search ()
#7  0x0814195e in __bam_c_search ()
#8  0x0813ea44 in __bam_c_get ()
#9  0x0811c58e in __db_c_get ()
#10 0x08116ed8 in __db_get ()
#11 0x0809bcd0 in bdb_dn2id (op=0x81da238, txn=0x81db790, dn=0x41cd54f8, 
ei=0x41cd54ec) at dn2id.c:326
#12 0x080989a1 in bdb_cache_find_ndn (op=0x81da238, txn=0x81db790, 
ndn=0x41cd55c4, res=0x41cd55b8, locker=2147484665) at cache.c:362
#13 0x0808c7e0 in bdb_modrdn (op=0x81da238, rs=0x41cd58cc) at modrdn.c:692
#14 0x08066a99 in do_modrdn (op=0x81da238, rs=0x41cd58cc) at modrdn.c:366
#15 0x08056355 in connection_operation (ctx=0x41cd593c, arg_v=0x81da238) at 
connection.c:963
#16 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#17 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 5 (Thread 49156 (LWP 29366)):
#0  0x4004597f in fsync () from /lib/libpthread.so.0
#1  0x080efcc1 in __os_fsync ()
#2  0x080ec099 in __log_flush_int ()
#3  0x080eb656 in __log_flush_commit ()
#4  0x080eb344 in __log_put ()
#5  0x08102612 in __txn_regop_log ()
#6  0x081010a8 in __txn_commit ()
#7  0x080976c7 in bdb_add (op=0x81d75b8, rs=0x417d38cc) at add.c:496
#8  0x0805b0e3 in do_add (op=0x81d75b8, rs=0x417d38cc) at add.c:303
#9  0x08056325 in connection_operation (ctx=0x417d393c, arg_v=0x81d75b8) at 
connection.c:953
#10 0x080ad7f1 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:463
#11 0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 4 (Thread 32771 (LWP 29346)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003ed80 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x080ad959 in ldap_int_thread_pool_wrapper (xpool=0x8195998) at 
tpool.c:452
#4  0x4003fd80 in pthread_start_thread () from /lib/libpthread.so.0

Thread 3 (Thread 16386 (LWP 29344)):
#0  0x40162271 in select () from /lib/libc.so.6
#1  0x00000000 in ?? ()

Thread 2 (Thread 32769 (LWP 29343)):
#0  0x40160827 in poll () from /lib/libc.so.6
#1  0x4003f97e in __pthread_manager () from /lib/libpthread.so.0

Thread 1 (Thread 16384 (LWP 29340)):
#0  0x400b3548 in sigsuspend () from /lib/libc.so.6
#1  0x40042228 in __pthread_wait_for_restart_signal () from 
/lib/libpthread.so.0
#2  0x4003f5bf in pthread_join () from /lib/libpthread.so.0
#3  0x0805440e in slapd_daemon () at daemon.c:2006
#4  0x0804b3a1 in main (argc=1080884192, argv=0x40047054) at main.c:579
#5  0x4009f857 in __libc_start_main () from /lib/libc.so.6
#0  0x40153f57 in sched_yield () from /lib/libc.so.6
*********************

Regards,

Erik


On Tuesday 01 July 2003 17:33, Jong wrote:
> It would help to look at the problem if you post "db_stat -c" result in the
> list.
> Also wondering whether the recent congestion control patch can solve this
> problem,
> but I suspect it because "Number of deadlocks" is zero in this case.
> - Jong
>
> ----------------
> Jong Hyuk Choi
> IBM Thomas J. Watson Research Center - Enterprise Linux Group
> P.O. Box 218, Yorktown Heights, NY 10598
> email: jongchoi@us.ibm.com
> (phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979
> ----- Original Message -----
>
> From: <edg72@home.nl>
> To: <openldap-its@OpenLDAP.org>
> Sent: Thursday, June 26, 2003 5:17 PM
> Subject: subject=(ITS#2610)
>
> > Hi,
> >
> > I've done some futher investigation on this ITS and I've found out that
> > it
>
> is
>
> > BerkeleyDB that is calling ldap_pvt_thread_yield() all the time (callback
> > registered in /servers/slapd/back-bdb/init.c). This is the cause of the
>
> high
>
> > CPU usage and the large amount of context switches (over 100K/sec!) I've
> > seen. I suspect there is some bdb locking problem going on. When I run
> > db_stat -c while performing the modify benchmark I see that "Total number
>
> of
>
> > locks not immediately available due to conflicts" is steadily increasing,
>
> I
>
> > am not sure if that is a serious issue. The "Number of deadlocks" is zero
>
> and
>
> > there are no signs of BDB running out of locks/lockers/lock objects.
> > I've also added some extra debug statements in
>
> /servers/slapd/back-bdb/idl.c
>
> > because from slapd's debug logging I saw that executing the function
> > bdb_idl_delete_key() took al lot of time, sometimes more than 15 seconds!
> > I'll try to continue with digging into this problem, but I am starting to
>
> run
>
> > out of ideas. Any ideas would be welcome!
> >
> > Regards,
> >
> > Erik
> >
> > PS I forgot to mention that I use BerkeleyDB 4.1.25p1 NC