Full_Name: Brandon Hume
Version: 2.4.28
OS: RHEL EL6.1, kernel 2.6.32-131.12.1.el6.x86_64
URL: http://den.bofh.ca/~hume/ol_v3_fail_syslog.txt
Submission from: (NULL) (2001:410:a010:2:223:aeff:fe74:400e)

OpenLDAP 2.4.28 compiled with BerkeleyDB 5.2.36, attempting to configure for multi-master replication. Problems occurred with the second server complaining about operations errors while attempting to replicate the cn=config tree; I've been able to reproduce the problem from the command line using ldapsearch from the same compile.

The host is RHEL 6.1 running inside VMWare. Version of OpenLDAP is 2.4.28, compiled with BerkeleyDB 5.2.36 (no patches available from Oracle at the time I downloaded). Configuration options used are visible in http://den.bofh.ca/~hume/config_ol_sh.txt

The problem does *not* seem to be consistent, as *sometimes* it will work but most of the time not. I've run the example query once, had it fail, run it again immediately and had it succeed, and then again fail a dozen times in a row. Wireshark traces showed the connection was established using LDAPv3; putting the server in debug "Any" mode also confirms the same. The syslogs generated are viewable at http://den.bofh.ca/~hume/ol_v3_fail_syslog.txt

I've compiled 2.4.27 with the same options and it *seemed* to behave, though I may have merely gotten lucky.

Example ldapsearch invocation:

$ /appl/ldap/bin/ldapsearch -x -D 'cn=config' -h kil-ds-3.its.dal.ca -b cn=config -w bindpass -s sub -E sync=rp -e manageDSAit 'objectclass=*'
# extended LDIF
#
# LDAPv3
# base <cn=config> with scope subtree
# filter: objectclass=*
# requesting: ALL
# with manageDSAit control
#

# search result
search: 2
result: 2 Protocol error
text: controls require LDAPv3

# numResponses: 1
ldap_result: Can't contact LDAP server (-1)

These servers are not yet in production and I have ~two weeks to play and tweak to generate more debugging information if you'd find it useful.
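Because the failure is intermittent, a single run of the query says little. The following is a minimal sketch of a harness that repeats the reported ldapsearch invocation and tallies the exit statuses. The command line is copied from the report; the helper names `run_once` and `tally` are hypothetical, and the sketch assumes ldapsearch exits with a non-zero status when the search fails.

```python
import subprocess
from collections import Counter
from typing import Callable, Sequence

# Command copied from the report; "bindpass" is the placeholder used there.
LDAPSEARCH_CMD = [
    "/appl/ldap/bin/ldapsearch", "-x", "-D", "cn=config",
    "-h", "kil-ds-3.its.dal.ca", "-b", "cn=config", "-w", "bindpass",
    "-s", "sub", "-E", "sync=rp", "-e", "manageDSAit", "objectclass=*",
]


def run_once(cmd: Sequence[str] = LDAPSEARCH_CMD) -> int:
    """Run the command once, discard its output, and return its exit status."""
    completed = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode


def tally(attempt: Callable[[], int], attempts: int) -> Counter:
    """Call attempt() the given number of times and count each exit status."""
    return Counter(attempt() for _ in range(attempts))


# Example (hypothetical run count): print(tally(run_once, 50))
```

Passing the attempt as a callable keeps the counting logic separate from the command, so the same tally can be reused for a soak test after a candidate fix is deployed.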
hume-ol@bofh.ca wrote:
> Full_Name: Brandon Hume
> Version: 2.4.28
> OS: RHEL EL6.1, kernel 2.6.32-131.12.1.el6.x86_64
> URL: http://den.bofh.ca/~hume/ol_v3_fail_syslog.txt
> Submission from: (NULL) (2001:410:a010:2:223:aeff:fe74:400e)
>
>
> OpenLDAP 2.4.28 compiled with BerkeleyDB 5.2.36, attempting to configure for
> multi-master replication. Problems occurred with the second server complaining
> about operations errors while attempting to replicate the cn=config tree; I've
> been able to reproduce the problem from the command line using ldapsearch from
> the same compile.
>
> The host is RHEL 6.1 running inside VMWare. Version of OpenLDAP is 2.4.28,
> compiled with BerkeleyDB 5.2.36 (no patches available from Oracle at the time I
> downloaded). Configuration options used are visible in
> http://den.bofh.ca/~hume/config_ol_sh.txt
>
> The problem does *not* seem to be consistent, as *sometimes* it will work but
> most of the time not. I've run the example query once, had it fail, run it
> again immediately and had it succeed, and then again fail a dozen times in a
> row. Wireshark traces showed the connection was established using LDAPv3;
> putting the server in debug "Any" mode also confirms the same. The syslogs
> generated are viewable at http://den.bofh.ca/~hume/ol_v3_fail_syslog.txt
>
> I've compiled 2.4.27 with the same options and it *seemed* to behave, though I
> may have merely gotten lucky.

Only two source files are changed between 2.4.27 and 2.4.28, and neither of the changes affects cn=config, syncrepl, or Bind operations. If you're seeing a difference in behavior here then I can only conclude that your build environment is corrupted.
> Example ldapsearch invocation:
>
> $ /appl/ldap/bin/ldapsearch -x -D 'cn=config' -h kil-ds-3.its.dal.ca -b
> cn=config -w bindpass -s sub -E sync=rp -e manageDSAit 'objectclass=*'
> # extended LDIF
> #
> # LDAPv3
> # base <cn=config> with scope subtree
> # filter: objectclass=*
> # requesting: ALL
> # with manageDSAit control
> #
>
> # search result
> search: 2
> result: 2 Protocol error
> text: controls require LDAPv3
>
> # numResponses: 1
> ldap_result: Can't contact LDAP server (-1)
>
>
> These servers are not yet in production and I have ~two weeks to play and tweak
> to generate more debugging information if you'd find it useful.
>
>

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
changed state Open to Feedback
I have reproduced the problem on 2.4.27. The mentioned ldapsearch query seems to have a high probability of success if run immediately after server start, which also seems true on 2.4.28.

I threw in an extra Debug() line, and it shows op->op_protocol == 2 at the check.

I'm rebuilding with as many debugging symbols enabled as possible to try to figure out what's going on.
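To make the symptom concrete, here is a toy model of a version check of the kind the error text implies: a request carrying controls is rejected when the protocol version recorded for the operation is below 3. This is an illustrative sketch only, not slapd's source. The `Operation` class and `check_controls` function are hypothetical, and the thread does not say why the recorded version reads 2 on a connection that was negotiated as LDAPv3.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

LDAP_VERSION3 = 3
LDAP_PROTOCOL_ERROR = 2  # LDAP result code 2 (protocolError), as in "result: 2"


@dataclass
class Operation:
    """Hypothetical stand-in for a server-side operation record."""
    protocol: int                       # protocol version recorded for the op
    controls: List[str] = field(default_factory=list)


def check_controls(op: Operation) -> Optional[Tuple[int, str]]:
    """Return (result code, text) if controls are refused, else None."""
    if op.controls and op.protocol < LDAP_VERSION3:
        return (LDAP_PROTOCOL_ERROR, "controls require LDAPv3")
    return None
```

In this model, an operation whose recorded version is 2 and which carries the manageDSAit control is refused with the same code and text as in the report, while the same request with version 3 passes the check.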
changed notes
changed state Feedback to Test
moved from Incoming to Software Bugs
hume-ol@bofh.ca wrote:
> I have reproduced the problem on 2.4.27. The mentioned ldapsearch query
> seems to have a high probability of success if run immediately after
> server start, which also seems true on 2.4.28.
>
> I threw in an extra Debug() line and op->op_protocol == 2 at the check.
>
> I'm rebuilding with as many debugging symbols enabled as possible to try
> to figure out what's going on.

A fix for this is in git master, please test.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
On 12/15/11 11:14 PM, Howard Chu wrote:
>
> A fix for this is in git master, please test.
>

Built and deployed. I've let it run for 72h and the problem has yet to show. I believe this particular bug has been squashed.
changed notes
changed state Test to Release
changed notes
changed state Release to Closed
fixed in master
fixed in RE24