
Re: On behalf of "Perece" <ntb@mts.ru>: more bugs in back-meta (ITS#2310)



Greetings.

Pierangelo Masarati wrote:
> 
> > I didn't try to stop second slapd on (a)
> > But when I stop slapd on (b) slapd-meta on (a) core dumps
> > on first search involving dc=sub,dc=suffix
> 
> I cannot reproduce this; what I get in this case is
> 
> # extended LDIF
> #
> # LDAPv3
> # base <dc=spb,dc=mts,dc=ru> with scope sub
> # filter: (objectclass=*)
> # requesting: ALL
> #
> 
> # search result
> search: 2
> result: 32 No such object
> matchedDN: dc=mts,dc=ru
> 
> # numResponses: 1
> 
> > if I try to search, say, ou=something,dc=suffix against slapd-meta
> > it will handle such request even if slapd on (b) is down.
> > also, stopping and then immediately starting slapd on (b)
> > doesn't harm anything if no searches destined to it
> > performed on slapd-meta between stop and start of slapd (b)
> > (although I noticed persistent connections between slapd-meta
> > and targets, it reestablishes them well if the server is available).
> 
> same applies to my system.  When a target server is down,
> it is not searched, that's it.
> 
> You should provide a stack backtrace of the failure
> (from current HEAD code).

As bdb detection is now fixed in HEAD, I can run this test.
I get the same result with current HEAD code. Here are parts
of the truss and pstack output:

---
13664/6:        poll(0xFD3FF258, 1, -1)                         = 1
13664/6:        getpeername(9, 0xFD401168, 0xFD401164, 1)       Err#134
13664/6:        read(9, 0xFD401163, 1)                          Err#146 ECONNREFUSED
13664/6:        shutdown(9, 2, 1)                               Err#134
13664/6:        close(9)                                        = 0
13664/6:            Incurred fault #6, FLTBOUNDS  %pc = 0xFF133144
13664/6:              siginfo: SIGSEGV SEGV_MAPERR addr=0x00000000
13664/6:            Received signal #11, SIGSEGV [default]
13664/6:              siginfo: SIGSEGV SEGV_MAPERR addr=0x00000000
---

So we need the stack of thread 6:

---
-----------------  lwp# 6 / thread# 6  --------------------
 ff133144 strlen   (0, fd401870, 0, bb9a1, 0, a204c) + 80
 ff185014 vsnprintf (fd400810, 7fffffff, a2020, fd401870, 5fafc, 0) + 5c
 0008d894 lutil_debug (fd400810, ffffffff, a2020, 0, a4e30, a1f70) + 34
 0005e350 meta_back_dobind (caa88, cb5d0, cb5d0, 8, fd401c00, 0) + fc
 000590a8 meta_back_search (fd401bf8, 10ad68, cb5d0, fd401c08, fd401c00, caa88) + 78
 00022fe8 do_search (0, cb5d0, 91400, b9800, bbc00, fd4019dc) + aa4
 000215c0 ???????? (cc4a0, cadd0, 21424, 0, 0, 0)
 0006f8e8 ???????? (be008, ff083d10, 1, ff0fad94, 0, 2)
 ff0db734 _thread_start (be008, 0, 0, 0, 0, 0) + 40
---

My guess is that strlen is called with a NULL pointer argument.
It seems vsnprintf was called with a "%s" format and NULL as the
corresponding argument, or it may be an internal Solaris bug fixed
by some later patch...
I took a look through the sources: the only strlen calls after
vsnprintf can be from within _doprnt, and they are against the format
string. But the format does not seem to be NULL in this case...
I can't say much more on this.
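
To illustrate the failure mode I suspect (a sketch only, not the actual
back-meta code): passing NULL for a "%s" conversion is undefined behavior
in C. Some C libraries print "(null)", others call strlen on the pointer
and fault, which would match the stack above. The names safe_str and
debug_format below are hypothetical; debug_format only stands in for a
printf-style debug routine.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: replace a NULL string with a printable
 * placeholder before it reaches a "%s" conversion. */
static const char *safe_str(const char *s)
{
    return s != NULL ? s : "(null)";
}

/* Hypothetical stand-in for a printf-style debug routine:
 * formats the message into buf and returns vsnprintf's result. */
static int debug_format(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}
```

A caller would then write something like
debug_format(buf, sizeof buf, "bind to \"%s\" failed", safe_str(dn)),
so that a NULL dn never reaches strlen inside the C library.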

SMTP /Perece/.