
(ITS#3534) Syncrepl producer fails on abnormal exit from consumer



Full_Name: Roger Winslow
Version: 2.2.23
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (67.114.146.162)


OpenLDAP version 2.2.23
BerkeleyDB version 4.2.52.2
OpenSSL version 0.9.7e

When an LDAP server runs as a syncrepl producer, a syncrepl consumer is
connected to it over a TLS channel, and the consumer is terminated midway
through a replication, the producer dies.  This problem has also been observed
on Solaris 9.

The consumers are using refreshAndPersist.  
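
For readers unfamiliar with the setup, a refreshAndPersist consumer of this
kind is typically configured with a syncrepl directive along the following
lines.  This is only a sketch, not the configuration used in this report: the
provider host and port are taken from the consumer trace further down, while
the rid, search base, bind DN and credentials are placeholders.

```
# Hypothetical consumer-side slapd.conf fragment (placeholders, not from this report).
# Only the provider host and port come from the consumer trace; the rest is assumed.
syncrepl rid=001
        provider=ldaps://a.host.com:636
        type=refreshAndPersist
        searchbase="o=some,dc=where,dc=far"
        bindmethod=simple
        binddn="cn=replica,o=some,dc=where,dc=far"
        credentials=secret
```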

Here is the debug output from the producer at log level 1 when it dies:

connection_write(10): waking output for id=1
connection_get(10): got connid=1
connection_read(10): checking for input on id=1
ber_get_next
ber_get_next: tag 0x30 len 5 contents:
connection_input: conn=1 deferring operation: awaiting write
connection_get(10): got connid=1
connection_write(10): waking output for id=1
connection_get(10): got connid=1
connection_read(10): checking for input on id=1
ber_get_next
TLS trace: SSL3 alert read:warning:close notify
ber_get_next on fd 10 failed errno=0 (Success)
connection_read(10): input error=-2 id=1, closing.
connection_closing: readying conn=1 sd=10 for close
connection_close: conn=1 sd=10
slapd: connection.c:687: connection_destroy: Assertion `c->c_writewaiter == 0'
failed.
Aborted
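
One possible reading of the trace above is that the read path closes the
connection while a sender is still parked waiting for the socket to become
writable ("deferring operation: awaiting write").  The toy model below only
illustrates that invariant.  It is not slapd code: the class, the method names
and the wake_writer_first switch are hypothetical, and the switch is not a
proposed fix.

```python
class Connection:
    """Toy stand-in for one connection slot (not the real slapd struct)."""

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.write_waiter = False  # plays the role of c->c_writewaiter
        self.destroyed = False

    def begin_blocked_write(self):
        # A sender found the socket not writable and parks itself.
        self.write_waiter = True

    def finish_blocked_write(self):
        # The sender was woken and is no longer parked.
        self.write_waiter = False

    def destroy(self):
        # Same invariant as the assertion that failed in the trace.
        if self.write_waiter:
            raise AssertionError("writer still waiting at destroy")
        self.destroyed = True


def close_on_read_error(conn, wake_writer_first):
    """Model the read path tearing down a connection after a read error."""
    if wake_writer_first:
        conn.finish_blocked_write()
    conn.destroy()


def run(wake_writer_first):
    """Return what happens when a close races with a parked writer."""
    conn = Connection(1)
    conn.begin_blocked_write()
    try:
        close_on_read_error(conn, wake_writer_first)
        return "closed"
    except AssertionError:
        return "assertion failed"
```

Calling run(False) reproduces the failing case in the model, and run(True)
shows the teardown succeeding once no writer is left waiting.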


At other times the producer holds open the socket the consumer was connected
to.  It still accepts queries and new connections, but syncrepl consumers can
no longer get data.  When the producer process is then sent a SIGTERM to shut
it down, the CPU pegs and the server does not exit until a SIGQUIT is sent.


Here is the debug output from when the producer just hangs:

connection_get(10): got connid=2
connection_write(10): waking output for id=2
connection_get(10): got connid=2
connection_write(10): waking output for id=2
ber_flush: 839 bytes to sd 10
<= send_search_entry
connection_get(10): got connid=2
connection_write(10): waking output for id=2


Here is the debug output from when the consumer hangs:

<= key_change 0
=> key_change(ADD,88)
<= key_change 0
<= index_entry_add( 136, "uid=xxxx1,ou=people,o=some,dc=where,dc=far" ) success
bdb_add: added id=00000088 dn="uid=xxxx1,ou=people,o=some,dc=where,dc=far"
send_ldap_result: conn=4294967295 op=0 p=3
ldap_msgfree
ldap_result msgid -1
ldap_chkResponseList for msgid=-1, all=0
ldap_chkResponseList returns NULL
wait4msg (timeout 0 sec, 0 usec), msgid -1
wait4msg continue, msgid -1, all 0
** Connections:
* host: a.host.com  port: 636  (default)
  refcnt: 2  status: Connected
  last used: Tue Feb  8 01:35:52 2005

** Outstanding Requests:
 * msgid 2,  origid 2, status InProgress
   outstanding referrals 0, parent count 0
** Response Queue:
   Empty
ldap_chkResponseList for msgid=-1, all=0
ldap_chkResponseList returns NULL
ldap_int_select
connection_get(8): got connid=0