
Re: (ITS#3534) Syncrepl producer fails on abnormal exit from consumer



A workaround for this problem is in ITS#3546 followup #1.
We still have not decided on the permanent solution.

rwinslow@lbl.gov wrote:

>Full_Name: Roger Winslow
>Version: 2.2.23
>OS: Linux
>URL: ftp://ftp.openldap.org/incoming/
>Submission from: (NULL) (67.114.146.162)
>
>
>OpenLDAP version 2.2.23
>BerkeleyDB version 4.2.52.2
>OpenSSL version 0.9.7e
>
>When an LDAP server is running as a syncrepl producer, a syncrepl consumer is
>connected over a TLS channel, and the consumer is terminated midway through a
>replication process, the producer dies.  This problem has also been observed on
>Solaris 9.
>
>The consumers are using refreshAndPersist.  
>
>Here is the debug output from the producer at log level 1 when it dies ->
>
>connection_write(10): waking output for id=1
>connection_get(10): got connid=1
>connection_read(10): checking for input on id=1
>ber_get_next
>ber_get_next: tag 0x30 len 5 contents:
>connection_input: conn=1 deferring operation: awaiting write
>connection_get(10): got connid=1
>connection_write(10): waking output for id=1
>connection_get(10): got connid=1
>connection_read(10): checking for input on id=1
>ber_get_next
>TLS trace: SSL3 alert read:warning:close notify
>ber_get_next on fd 10 failed errno=0 (Success)
>connection_read(10): input error=-2 id=1, closing.
>connection_closing: readying conn=1 sd=10 for close
>connection_close: conn=1 sd=10
>slapd: connection.c:687: connection_destroy: Assertion `c->c_writewaiter == 0' failed.
>Aborted
>
>
>At other times the producer holds open the socket that the consumer was
>connected to.  The producer will accept queries and further connections, but
>syncrepl consumers will no longer be able to get data.  After the producer
>process is sent a SIGTERM to shut it down, the CPU pegs and the server will not
>die until a SIGQUIT is sent.
>
>
>Here is the debug from when the producer just hangs ->
>
>connection_get(10): got connid=2
>connection_write(10): waking output for id=2
>connection_get(10): got connid=2
>connection_write(10): waking output for id=2
>ber_flush: 839 bytes to sd 10
><= send_search_entry
>connection_get(10): got connid=2
>connection_write(10): waking output for id=2
>
>
>Here is the debug from when the consumer hangs ->
>
><= key_change 0
>=> key_change(ADD,88)
><= key_change 0
><= index_entry_add( 136, "uid=xxxx1,ou=people,o=some,dc=where,dc=far" ) success
>bdb_add: added id=00000088 dn="uid=xxxx1,ou=people,o=some,dc=where,dc=far"
>send_ldap_result: conn=4294967295 op=0 p=3
>ldap_msgfree
>ldap_result msgid -1
>ldap_chkResponseList for msgid=-1, all=0
>ldap_chkResponseList returns NULL
>wait4msg (timeout 0 sec, 0 usec), msgid -1
>wait4msg continue, msgid -1, all 0
>** Connections:
>* host: a.host.com  port: 636  (default)
>  refcnt: 2  status: Connected
>  last used: Tue Feb  8 01:35:52 2005
>
>** Outstanding Requests:
> * msgid 2,  origid 2, status InProgress
>   outstanding referrals 0, parent count 0
>** Response Queue:
>   Empty
>ldap_chkResponseList for msgid=-1, all=0
>ldap_chkResponseList returns NULL
>ldap_int_select
>connection_get(8): got connid=0
>
>
>  
>


-- 
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support