[Date Prev][Date Next] [Chronological] [Thread] [Top]

slapd frontend contention



Over the past couple years I've noticed that slapd's frontend was a bottleneck in our benchmarks on larger systems. But I hadn't quantified the extent of the problem. This week I stumbled across mutrace, which is a very handy tool that measures how much time is spent in mutexes.

Running against back-mdb with the test008 setup, using -l 100 on the tester showed some pretty interesting data. This is a run where slapd was configured with listener-threads 2 on my quadcore laptop:

The #1 contended mutex is in the thread pool, pool->ltp_mutex.
#2 and #4 are the socket-table mutexes. (With the default of 1 listener thread they're correspondingly higher in the chart.) #3 is slap_op_mutex which is used to ensure unique timestamps in each Operation structure. #5 is the LMDB writer mutex. #6 is the attr_alloc mutex, which we should probably consider eliminating but nothing after that is very significant. 7,8,9 are connection writer mutexes, and #10 is the pending CSN list mutex.

#####
mutrace: 0.2 successfully initialized for process slapd (PID: 10868).
^C
mutrace: Showing statistics for process slapd (PID: 10868).
mutrace: 717 mutexes used.

Mutex #471 (0x0x10bae48) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	tpool.c:217	ldap_pvt_thread_pool_init()

Mutex #488 (0x0x9100f0) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	daemon.c:2906	slapd_daemon()

Mutex #497 (0x0x9107e0) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	init.c:92	slap_init()
	main.c:787	main()
	libc-start.c:258	__libc_start_main()

Mutex #487 (0x0x9100a0) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	daemon.c:1629	slapd_daemon_init()

Mutex #635 (0x0x7ffe2d78c040) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	mdb.c:3795	mdb_env_setup_locks()
	mdb.c:3906	mdb_env_open()
	init.c:174	mdb_db_open()
	backend.c:224	backend_startup_one()
	backend.c:330	backend_startup()
	init.c:219	slap_startup()
	main.c:992	main()
	libc-start.c:258	__libc_start_main()

Mutex #495 (0x0x910720) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	attr.c:708	attr_init()
	entry.c:95	entry_init()
	init.c:117	slap_init()
	main.c:787	main()
	libc-start.c:258	__libc_start_main()

Mutex #113 (0x0x7ffe2d59c248) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	connection.c:421	connection_init()

Mutex #98 (0x0x7ffe2d59b6c8) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	connection.c:421	connection_init()

Mutex #105 (0x0x7ffe2d59bc88) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	connection.c:421	connection_init()

Mutex #4 (0x0x116cf20) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	backend.c:624	backend_db_init()
	bconfig.c:1661	config_generic()
	config.c:345	config_set_vals()
	config.c:419	config_add_vals()
	config.c:783	read_config_file()
	bconfig.c:4439	read_config()
	main.c:793	main()
	libc-start.c:258	__libc_start_main()

mutrace: Showing 10 mutexes in order of (write) contention count:

 Mutex #   Locked  Changed    Cont. cont.Time[ms] tot.Time[ms] avg.Time[ms]
     471   231744   178161    41572     14511.248      634.339        0.003
     488   108544    83353     9455      2449.662      356.085        0.003
     497   102537    97750     8429      5156.696      245.420        0.002
     487    89990    73426     7462      2477.159      318.022        0.004
     635     2401     2263     1562      5838.290      905.338        0.377
     495    23079     2596      205        65.932       74.804        0.003
     113    44029     7438      165        95.214     1260.856        0.029
      98    30427     5647      146        79.943     1100.263        0.036
     105    18433     4504      119        66.397      965.679        0.052
       4     4800     3540      117        30.774       12.467        0.003
     ...      ...      ...      ...           ...          ...          ...

mutrace: Note that rwlocks are shown as two lines: write locks then read locks.

mutrace: 45 condition variables used.

Condvar #43 (0x0x10bae70) first referenced by:
	mutrace.c:1744	pthread_cond_init()
	tpool.c:220	ldap_pvt_thread_pool_init()

mutrace: Showing 1 condition variables in order of wait contention count:

  Cond #    Waits  Signals    Cont. tot.Time[ms] cont.Time[ms] avg.Time[ms]
      43    18389   106667    17090    64222.152     10010.642        0.544
     ...      ...      ...      ...          ...           ...          ...

mutrace: Total runtime is 10582.976 ms.

mutrace: Results for SMP with 4 processors.

mutrace: WARNING: 2 internal hash collisions detected. Results might not be as reliable as they could be.
mutrace:          Try to increase --hash-size=, which is currently at 67137.
#####

The total runtime is relatively arbitrary, based on how quickly I typed ^C after the slapd-tester program finished. But notice that the condition variable is contended for through almost the entire runtime. This is also in the threadpool, pool->ltp_cond.

I've now written a patch to allow a configurable number of workqueues in a threadpool. Here's the same slapd-tester workload against slapd with queues=4.

#####
mutrace -d --hash-size=67137 ../servers/slapd/slapd -f /tmp/testr/slapd.1.conf -h ldap://:9011 -s0 -d0
mutrace: 0.2 successfully initialized for process slapd (PID: 10191).
^C
mutrace: Showing statistics for process slapd (PID: 10191).
mutrace: 724 mutexes used.

Mutex #535 (0x0x90f7e0) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	init.c:92	slap_init()
	[0x65742f706d742f00] ??() ??:0

Mutex #528 (0x0x90f0f0) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	daemon.c:2906	slapd_daemon()

Mutex #527 (0x0x90f0a0) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	daemon.c:1629	slapd_daemon_init()

Mutex #691 (0x0xfd8eb0) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	tpool.c:247	ldap_pvt_thread_pool_init0()

Mutex #692 (0x0xfd8f78) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	tpool.c:247	ldap_pvt_thread_pool_init0()

Mutex #694 (0x0xfd9108) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	tpool.c:247	ldap_pvt_thread_pool_init0()

Mutex #693 (0x0xfd9040) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	tpool.c:247	ldap_pvt_thread_pool_init0()

Mutex #530 (0x0x7fa641167040) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	mdb.c:3795	mdb_env_setup_locks()
	mdb.c:3906	mdb_env_open()
	init.c:174	mdb_db_open()
	backend.c:224	backend_startup_one()
	backend.c:330	backend_startup()
	init.c:220	slap_startup()
	libc-start.c:258	__libc_start_main()

Mutex #533 (0x0x90f720) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	attr.c:708	attr_init()
	entry.c:95	entry_init()
	init.c:117	slap_init()
	[0x65742f706d742f00] ??() ??:0

Mutex #592 (0x0x7fa640f76108) first referenced by:
	mutrace.c:1448	pthread_mutex_init()
	connection.c:421	connection_init()

mutrace: Showing 10 mutexes in order of (write) contention count:
 Mutex #   Locked  Changed    Cont. cont.Time[ms] tot.Time[ms] avg.Time[ms]
     535   102537    97558    11075      6417.798      247.221        0.002
     528   119100    89722     8905      2224.464      373.760        0.003
     527   105147    81593     7472      2074.638      339.454        0.003
     691    63260    47195     2924       680.201      190.002        0.003
     692    60573    47065     2537       504.543      180.968        0.003
     694    55821    42992     2296       548.873      171.009        0.003
     693    54335    39828     2234       644.864      188.289        0.003
     530     2401     2068     1437      6222.018      865.117        0.360
     533    23079     2464      246        87.519       66.836        0.003
     592    34036     5594      236        91.546     1306.007        0.038
     ...      ...      ...      ...           ...          ...          ...

mutrace: Note that rwlocks are shown as two lines: write locks then read locks.

mutrace: 51 condition variables used.

Condvar #45 (0x0xfd8fa0) first referenced by:
	mutrace.c:1744	pthread_cond_init()
	tpool.c:250	ldap_pvt_thread_pool_init0()

Condvar #49 (0x0xfd9130) first referenced by:
	mutrace.c:1744	pthread_cond_init()
	tpool.c:250	ldap_pvt_thread_pool_init0()

Condvar #47 (0x0xfd9068) first referenced by:
	mutrace.c:1744	pthread_cond_init()
	tpool.c:250	ldap_pvt_thread_pool_init0()

Condvar #43 (0x0xfd8ed8) first referenced by:
	mutrace.c:1744	pthread_cond_init()
	tpool.c:250	ldap_pvt_thread_pool_init0()

mutrace: Showing 4 condition variables in order of wait contention count:

  Cond #    Waits  Signals    Cont. tot.Time[ms] cont.Time[ms] avg.Time[ms]
      45     5580    27496     3692    13306.017       792.845        0.142
      49     3798    26011     2333    13349.744       658.738        0.173
      47     4364    24985     2204    14182.993      1001.988        0.230
      43     3929    29665     2102    12901.160      1186.375        0.302
     ...      ...      ...      ...          ...           ...          ...

mutrace: Total runtime is 23920.267 ms.

mutrace: Results for SMP with 4 processors.
#####

Now the #1 contended mutex is the slap_op_time mutex. #2 and #3 are the socket table mutexes. #4,5,6,7 are the threadpool workqueue mutexes. The important thing to note is that while the total time all 4 of these mutexes are held adds up to about the same as the single mutex in the previous run, the total contended time is less than 1/6th of the previous run. I.e., splitting 1 mutex into 4 reduced our overhead by a factor of 6.

For the condvar, the total contended time of the 4 cvs is 2.7x less than the previous run.

The code is currently available on my mutex branch on ada.openldap.org. The pool-pause feature hasn't yet been tested. When I've finished that I'll merge to master for RE25.

I believe this patch finally gets the frontend out of LMDB's way. I've also looked at restructuring the writer mutexes, but that may prove to be a deadend, based on the signalling requirements we have.
--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/