Full_Name: Robert Brooks Version: 2 OS: 2.4.12 URL: ftp://ftp.openldap.org/incoming/ Submission from: (NULL) (62.49.19.14) Set up is a pair of centos 4 hosts, one acts as primary, the other is a syncrepl consumer and mysql slave. Apart from syncrepl config/myqsl slave all config and software versions are the same. Mysql queries on both hosts are to localhost via socket. After a period of operation (24-48 hours) the slave host drops the back-sql part of directory for some queries (same query will suceed some times). Using strace it can been seen that the ldap searches are not resulting in queries to mysql. Back-sql is a subordinate db and results from hdb backed part of directory are returned. Restarting slapd fixes the problem, but start-up reports the shutdown was unclean on hdb db. Operation of primary is reliable.
changed notes
strace from failing slapd process 13637 send(3, "<167>Nov 5 10:34:39 slapd[13628"..., 117, MSG_NOSIGNAL) = 117 13637 semop(22052874, 0xb53ad7d0, 2) = 0 13637 semop(22052874, 0xb53ad7e0, 1) = 0 13637 time(NULL) = 1225881279 13637 semop(22052874, 0xb53ad750, 2) = 0 13637 semop(22052874, 0xb53ad760, 1) = 0 13637 write(57, "\27\3\1\0 \334@\'\353]R\251-Q\251Dd>0\376&\227\326\4\222dU\t\266/\356\177"..., 90) = 90 13637 time(NULL) = 1225881279 13637 stat64("/etc/localtime", {st_dev=makedev(3, 3), st_ino=1032239, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=16, st_size= 1323, st_atime=2008/11/05-10:31:17, st_mtime=2007/08/09-19:54:26, st_ctime=2007/08/09-19:54:26}) = 0 13637 stat64("/etc/localtime", {st_dev=makedev(3, 3), st_ino=1032239, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=16, st_size= 1323, st_atime=2008/11/05-10:31:17, st_mtime=2007/08/09-19:54:26, st_ctime=2007/08/09-19:54:26}) = 0 13637 stat64("/etc/localtime", {st_dev=makedev(3, 3), st_ino=1032239, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=16, st_size= 1323, st_atime=2008/11/05-10:31:17, st_mtime=2007/08/09-19:54:26, st_ctime=2007/08/09-19:54:26}) = 0 13637 send(3, "<167>Nov 5 10:34:39 slapd[13628"..., 76, MSG_NOSIGNAL) = 76 looks like it connect to mysql but never queries it. any way I can capture something useful with gdb?
can re-create this problem by restarting mysql. Following a restart about one in five queries fails on back-sql portion of directory.
robb@wtg.cw.com wrote: > Full_Name: Robert Brooks > Version: 2 > OS: 2.4.12 > URL: ftp://ftp.openldap.org/incoming/ > Submission from: (NULL) (62.49.19.14) > > > Set up is a pair of centos 4 hosts, one acts as primary, the other is a syncrepl > consumer and mysql slave. Apart from syncrepl config/myqsl slave all config and > software versions are the same. Mysql queries on both hosts are to localhost via > socket. > > After a period of operation (24-48 hours) the slave host drops the back-sql part > of directory for some queries (same query will suceed some times). Using strace > it can been seen that the ldap searches are not resulting in queries to mysql. > Back-sql is a subordinate db and results from hdb backed part of directory are > returned. > > Restarting slapd fixes the problem, but start-up reports the shutdown was > unclean on hdb db. Operation of primary is reliable. I have no possibility to setup a back-sql with mysql right now. Can you post your configuration? Can you post some (verbose) logs of operations affecting back-sql (e.g. with BACKSQL_TRACE manually #defined in back-sql/back-sql.h)? Can you step thru a search into backsql_search() and see if it gets called and why it doesn't get to contacting the rdbms? p. Ing. Pierangelo Masarati OpenLDAP Core Team SysNet s.r.l. via Dossi, 8 - 27100 Pavia - ITALIA http://www.sys-net.it ----------------------------------- Office: +39 02 23998309 Mobile: +39 333 4963172 Fax: +39 0382 476497 Email: ando@sys-net.it -----------------------------------
Pierangelo Masarati wrote: > I have no possibility to setup a back-sql with mysql right now. Can you > post your configuration? Can you post some (verbose) logs of operations > affecting back-sql (e.g. with BACKSQL_TRACE manually #defined in > back-sql/back-sql.h)? Can you step thru a search into backsql_search() > and see if it gets called and why it doesn't get to contacting the rdbms? I'm compiling with BACKSQL_TRACE defined right now. After further prodding I have discovered... ejabberd (a jabber daemon) makes connections to ldap, which it keeps open indefinitely. It appears mysql's odbc connector will time out an idle mysql connection after 8 hours. Since this is the slave system it's entirely possible for it not touch mysql connections for at least 8 hours. If we restart mysql some ldap queries fail on back-sql portion until we kill ejabberd (to close it's ldap connections). I'll give you logs with tracing enabled asap, but this looks like a big clue as to what is happening.
Robert Brooks wrote: > Pierangelo Masarati wrote: >> I have no possibility to setup a back-sql with mysql right now. Can >> you post your configuration? Can you post some (verbose) logs of >> operations affecting back-sql (e.g. with BACKSQL_TRACE manually >> #defined in back-sql/back-sql.h)? Can you step thru a search into >> backsql_search() and see if it gets called and why it doesn't get to >> contacting the rdbms? > > I'm compiling with BACKSQL_TRACE defined right now. > > After further prodding I have discovered... > > ejabberd (a jabber daemon) makes connections to ldap, which it keeps > open indefinitely. It appears mysql's odbc connector will time out an > idle mysql connection after 8 hours. Since this is the slave system it's > entirely possible for it not touch mysql connections for at least 8 hours. > > If we restart mysql some ldap queries fail on back-sql portion until we > kill ejabberd (to close it's ldap connections). > > I'll give you logs with tracing enabled asap, but this looks like a big > clue as to what is happening. OK, this makes sense. In fact, back-sql does not seem to specifically handle broken connections to the odbc. It appears it would need something similar to idle-timeout as in back-ldap and back-meta. p. Ing. Pierangelo Masarati OpenLDAP Core Team SysNet s.r.l. via Dossi, 8 - 27100 Pavia - ITALIA http://www.sys-net.it ----------------------------------- Office: +39 02 23998309 Mobile: +39 333 4963172 Fax: +39 0382 476497 Email: ando@sys-net.it -----------------------------------
Pierangelo Masarati wrote: > OK, this makes sense. In fact, back-sql does not seem to specifically > handle broken connections to the odbc. It appears it would need > something similar to idle-timeout as in back-ldap and back-meta. ok, this looks like it... Nov 11 17:01:38 nops slapd[24733]: backsql_dn2id("dc=sql,dc=example,dc=com") : error executing query ("SELECT id,keyval,oc_map_id,dn FROM ldap_entries WHERE dn=?", "dc=sql,dc=example,dc=com"): Nov 11 17:01:38 nops slapd[24733]: Return code: -1 Nov 11 17:01:38 nops slapd[24733]: nativeErrCode=2006 SQLengineState=08S01 msg="[unixODBC][MySQL][ODBC 3.51 Driver][mysqld-5.0.67-log]MySQL server has gone away" Nov 11 17:01:38 nops slapd[24733]: <==backsql_dn2id("dc=sql,dc=example,dc=com") : err=80 if you'd like a tarball of full logs + config just say. Regards, Rob
changed notes moved from Incoming to Software Bugs
Pierangelo Masarati wrote: > OK, this makes sense. In fact, back-sql does not seem to specifically > handle broken connections to the odbc. It appears it would need > something similar to idle-timeout as in back-ldap and back-meta. current work-around is to set "idletimeout" in openldap shorter than mysql's "wait_timeout" (default 8hrs).
Pierangelo Masarati wrote: [...] > OK, this makes sense. In fact, back-sql does not seem to specifically > handle broken connections to the odbc. It appears it would need > something similar to idle-timeout as in back-ldap and back-meta. any luck with this? Regards, Rob
Robert Brooks wrote: > any luck with this? Not even in my priority list right now, sorry. A patch would be welcome. p. Ing. Pierangelo Masarati OpenLDAP Core Team SysNet s.r.l. via Dossi, 8 - 27100 Pavia - ITALIA http://www.sys-net.it ----------------------------------- Office: +39 02 23998309 Mobile: +39 333 4963172 Fax: +39 0382 476497 Email: ando@sys-net.it -----------------------------------
http://www.openldap.org/lists/openldap-technical/201009/msg00070.html While googling for reconnect in the ODBC API I found this: http://dev.mysql.com/doc/refman/5.0/en/connector-odbc-configuration-connection-parameters.html option 4194304: Enables auto-reconnection functionality. It seems like it fixes my problem.
back-sql needs retry
patches welcome