[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: 2.4.44 + ITS 8432 patch segfault in modify_add_values



On Thu, Feb 16, 2017 at 03:53:40PM -0800, Quanah Gibson-Mount wrote:

> Yeah, so this is the operation that actually failed... It'd be interesting 
> to know if it succeeded in the primary DB, but failed when writing to the 
> accesslog DB (I.e., the master and its consumers are now out of sync for 
> that entry), or if the entire write op failed (master and consumers are in 
> sync for the entry)

I had another crash today :(. The logs leading up to the crash were:

Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145931 MOD dn="uid=mlmungia,ou=user,dc=cpp,dc=edu"
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145931 MOD attr=eduPersonAffiliation eduPersonPrimaryAffi
liation cppEduPersonAffiliation
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145931 RESULT tag=103 err=0 text=
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145932 MOD dn="uid=mlmungia,ou=user,dc=cpp,dc=edu"
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145932 MOD attr=cppEduPersonExpiration
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145932 RESULT tag=103 err=0 text=
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145933 MOD dn="uid=mlmungia,ou=user,dc=cpp,dc=edu"
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145933 MOD attr=cppEduPersonStatusFlag
Mar  4 03:53:52 themis slapd[27421]: conn=1668323 op=145933 RESULT tag=103 err=0 text=

Interesting, this time there was no failed operation on the client side.
All three of these opereration succeeded, then the server died and the
load balancer cut over to the secondary. The client got some errors
while it was trying to initiate the next operation it wanted to do, but
it retried a couple of times and successfully performed it on the
secondary once the load balancer cut over.

When I brought the primary back online, all three of these operations
were reflected in its local database and had been replicated to the
other servers.

The backtrace was the same:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000485c61 in modify_add_values (e=e@entry=0x7f417d5b8480, mod=mod@entry=0x7f4170000980, 
    permissive=0, text=text@entry=0x7f417d5b89a0, 
    textbuf=textbuf@entry=0x7f417d5b84d0 "modify/delete: eduPersonAffiliation: no such attribute", 
    textlen=textlen@entry=256)
    at /var/lib/portage/tmp/portage/net-nds/openldap-2.4.44-r1/work/openldap-2.4.44/servers/slapd/mods.c:61
61              if ( !BER_BVISNULL( &mod->sm_values[mod->sm_numvals] )) {


It seems like this has something to do with duplicated operations? The
first time this operation was applied, the eduPersonAffiliation
attribute did exist, with a single value that was removed. Then for some
reason it's like it tried to remove it again and went boom? But there
was no operation that *actually* failed. Everything that the client
tried to do was done.

BTW, I opened the ITS for tracking, it's #8609. Thanks...