moved from Incoming to Software Bugs
changed notes changed state Open to Test
Full_Name: James Tavares Version: 1.2.12 OS: Solaris 7 URL: ftp://ftp.openldap.org/incoming/ Submission from: (NULL) (24.218.235.205) This weekend we witnessed the historic change from 999,999,999 seconds to 1,000,000,000 in the unix seconds-since-epoch counter. At the same time, all replication on my systems ceased to exist. Running slurpd in debug mode claimed that the records in my slurpd.replog were OLDER than what my slurpd.status showed, which was easily verified as FALSE... My slurpd.status hadn't been updated since 999,999,942 (verified to be TRUE against the file dates on the slave servers) and records in my slurpd.replog were dated 1,000,000,000 and beyond. I grep'd for the "skip repl record for" error message and found it in slurpd/ri.c. As you all know, the function "ri.c:isnew()" is responsible for determining if a REPLOG entry is newer than the last update recorded in SLURPD.STATUS. However, the function uses strcmp() to compare the two timestamps. Since strcmp() compares alphabetically, it makes sense that slurpd was reporting 1,000,000,000 to be older than 999,999,942. A simple (and working) patch follows: 24a25 > #include <math.h> 229d229 < int x; 234,235c234 < x = strcmp( re->re_timestamp, ri->ri_stel->last ); < if ( x > 0 ) { --- > if ( atof(re->re_timestamp) > atof(ri->ri_stel->last) ) { 238c237 < } else if ( x < 0 ) { --- > } else if ( atof(re->re_timestamp) < atof(ri->ri_stel->last) ) { I am not very familiar with openldap source code (this is the first time I've looked through it) so to be sure I didn't miss anything I used atof() instead of atoi().. I remember seeing a ".0" on a timestamp once in slurpd.replog, and while I guess that could be the sequence number referenced later in the function as opposed to some fractional-second, I didn't want to take any chances. Feel free to correct it to atoi() if re->re_timestamp and ri->ri_stel->last are guaranteed to be integers (whole seconds from time()). Also, I have no idea where ELSE in slapd / slurpd source code this type of bug may exist. I just happened to locate this one while trying to get my replication working :) Which, by the way, this patch (at least) seems to have fixed. Any questions? Just bounce me an email: jtavares@loa.com.
changed notes changed state Test to Release
changed notes
I committed changes to OPENLDAP_REL_ENG_2 which should resolve this issue. http://www.openldap.org/software/repo.html Please test, thanks.
changed notes changed state Release to Closed
From the release notes: > OpenLDAP 2.0.14 Release > Fixed slurpd millionth second bug (ITS#1323) Shouldn't that be the billionth second? -- Norbert Klasen DAASI International GmbH phone: +49 7071 29 70336 Wilhelmstr. 106 fax: +49 7071 29 5114 72074 Tübingen email: norbert.klasen@daasi.de Germany web: http://www.daasi.de
moved from Software Bugs to Archive.Software Bugs
fixed in HEAD fixed in re20