OpenLDAP
Viewing Incoming/6919

From: jarl@dallur.com
Subject: "Weekly" crashes of OpenLDAP 2.4.22 using syncrepl master/master
Date: Thu, 28 Apr 2011 17:30:50 +0000
From: jarl@dallur.com
To: openldap-its@OpenLDAP.org
Subject: "Weekly" crashes of OpenLDAP 2.4.22 using syncrepl master/master
Full_Name: Jarl Stefansson
Version:  2.4.22 
OS: Centos 5.5
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (81.15.35.75)


We run two servers with OpenLDAP in master/master mode using syncrepl. Roughly
once per week, one of the servers stops responding to new connections. I tried
logging with loglevel=1, and this is all I got. All suggestions and comments
are appreciated.

This is most likely a load-related problem, since we never experience it in
the test lab, only in production:

Apr 23 20:36:37 ldap01 slapd2.4[32233]: bdb_dn2entry("macaddress=1\2C6\2C00:00:00:00:00:01,ou=xxx,ou=xx,o=xxxx")
Apr 23 20:36:37 ldap01 slapd2.4[32233]: => bdb_dn2id("macaddress=1\2C6\2C00:00:00:00:00:01,ou=xxx,ou=xx,o=xxxx")
Apr 23 20:36:37 ldap01 slapd2.4[32233]: <= bdb_dn2id: get failed: DB_NOTFOUND: No matching key/data pair found (-30988)
Apr 23 20:36:37 ldap01 slapd2.4[32233]: => bdb_dn2id_add 0x2fed7: "macaddress=1\2C6\2C00:00:00:00:00:01,ou=xxx,ou=xx,o=xxxx"
Apr 23 20:37:44 ldap01 slapd2.4[32233]: slap_listener_activate(8):
Apr 23 21:35:42 ldap01 slapd2.4[32233]: connection_close: conn=1025 sd=16
Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1022 sd=17
Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1037 sd=18
Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1038 sd=20
Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1039 sd=25

Apr 24 12:02:23 ldap01 slapd2.4[32233]: daemon: shutdown requested and initiated.
Apr 24 12:02:23 ldap01 slapd2.4[32233]: connection_close: conn=1005 sd=21
Apr 24 12:02:23 ldap01 slapd2.4[32233]: slapd shutdown: waiting for 59 operations/tasks to finish
Apr 24 12:02:34 ldap01 slapd2.4[26503]: bdb_db_open: database "": unclean shutdown detected; attempting recovery.
Apr 24 12:02:35 ldap01 slapd2.4[26503]: slapd starting
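
The silences in this excerpt are easier to see when the timestamps are compared directly. The following is a minimal Python sketch, not part of the original report, that parses the syslog timestamps and lists every gap longer than a threshold. The names `parse_stamp` and `find_gaps`, the ten-minute threshold, and the year 2011 (taken from the date of this report, since syslog lines carry no year) are assumptions made for illustration.

```python
from datetime import datetime

ASSUMED_YEAR = 2011  # syslog lines carry no year; taken from the report date


def parse_stamp(line):
    """Return the timestamp of a syslog line, or None if it has none."""
    try:
        return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None  # wrapped continuation line or blank line


def find_gaps(lines, threshold_seconds=600):
    """Return (previous, current, seconds) for each silence longer than the threshold."""
    gaps = []
    previous = None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue
        if previous is not None:
            delta = (stamp - previous).total_seconds()
            if delta > threshold_seconds:
                gaps.append((previous, stamp, delta))
        previous = stamp
    return gaps


if __name__ == "__main__":
    sample = [
        "Apr 23 20:36:37 ldap01 slapd2.4[32233]: => bdb_dn2id_add 0x2fed7:",
        "Apr 23 20:37:44 ldap01 slapd2.4[32233]: slap_listener_activate(8):",
        "Apr 23 21:35:42 ldap01 slapd2.4[32233]: connection_close: conn=1025 sd=16",
        "Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1022 sd=17",
        "Apr 24 12:02:23 ldap01 slapd2.4[32233]: daemon: shutdown requested and initiated.",
    ]
    for before, after, seconds in find_gaps(sample):
        print(f"{before} -> {after}: {seconds:.0f} s of silence")
```

On a busy production server, a long gap between consecutive log lines is a quick hint as to when the process stopped doing useful work, even if no error was logged.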


---------------------- Relevant Config
------------------------------------------------
moduleload     syncprov.la
TLSCertificateFile      /etc/pki/tls/private/ldap.pem
TLSCertificateKeyFile   /etc/pki/tls/private/ldap.pem
TLSCACertificateFile    /etc/pki/tls/private/ldap.pem

serverID        001
database        bdb
sizelimit              50000
cachesize 10000
checkpoint 256 5


syncrepl        rid=001
                provider=ldap://10.10.10.10:389
                bindmethod=simple
                binddn="uid=xxxxrep,ou=xxxxxx,o=xxxxxx"
                credentials=MyPassword
                searchbase="o=xxxx"
                schemachecking=on
                sizelimit="unlimited"
                timelimit="unlimited"
                type=refreshAndPersist
                interval=00:00:00:10
                retry="5 5 60 +"
                attrs="*,+"

mirrormode on
overlay syncprov
syncprov-checkpoint     100     10
syncprov-sessionlog     100

idletimeout 3600
threads 32
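
As a reading aid for the syncrepl `retry` value in the config above: slapd.conf(5) describes it as pairs of `<retry interval> <number of retries>`, with `+` meaning an unlimited number of retries. The Python sketch below expands such a string into the sequence of waits. The function name `retry_delays` is an illustrative assumption and has nothing to do with slapd's own implementation.

```python
from itertools import islice


def retry_delays(spec):
    """Yield the wait in seconds before each reconnect attempt.

    spec is a syncrepl retry string such as "5 5 60 +": pairs of
    <interval> <count>, where a count of "+" means retry forever.
    """
    fields = spec.split()
    if not fields or len(fields) % 2:
        raise ValueError("retry must be pairs of <interval> <count>")
    for interval, count in zip(fields[::2], fields[1::2]):
        seconds = int(interval)
        if count == "+":
            while True:  # unlimited retries at this interval
                yield seconds
        for _ in range(int(count)):
            yield seconds


if __name__ == "__main__":
    # First eight waits for the value used in this report.
    print(list(islice(retry_delays("5 5 60 +"), 8)))
```

For `retry="5 5 60 +"` this gives five 5-second waits followed by 60-second waits indefinitely.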

Followup 1

Date: Sat, 30 Apr 2011 21:01:26 -0700
From: Howard Chu <hyc@symas.com>
To: jarl@dallur.com
CC: openldap-its@openldap.org
Subject: Re: (ITS#6919) "Weekly" crashes of OpenLDAP 2.4.22 using syncrepl master/master
jarl@dallur.com wrote:
> Full_Name: Jarl Stefansson
> Version:  2.4.22
> OS: Centos 5.5
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (81.15.35.75)
>
>
> Two servers running OpenLDAP in master/master using syncrepl, roughly once per
> week one of the servers stops responding to new connections, I tried logging
> with loglevel=1 and this is all I got, all suggestions and comments
> appreciated.
>
> This is most likely a "load" related problem since we never experience this
> problem in the testlab, only production.:

Your log doesn't indicate any kind of crash. It only shows that a shutdown was 
requested, and apparently your startup scripts didn't wait for the shutdown to 
complete before starting again.
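
One way a restart wrapper could avoid starting a second slapd while the first is still shutting down is to poll the old process ID until it disappears. The sketch below is a hypothetical helper in Python for POSIX systems. It is not the CentOS init script or anything shipped with OpenLDAP, and `pid_alive`, `wait_for_exit`, and the timeout values are assumptions.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permission
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def wait_for_exit(pid, timeout=300.0, poll=0.5):
    """Block until the process exits; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while pid_alive(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

A wrapper along these lines would read the PID from slapd's pidfile, send SIGTERM, call `wait_for_exit`, and start the new process only once it returns True.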

> Apr 23 20:36:37 ldap01 slapd2.4[32233]:
> bdb_dn2entry("macaddress=1\2C6\2C00:00:00:00:00:01,ou=xxx,ou=xx,o=xxxx")
> Apr 23 20:36:37 ldap01 slapd2.4[32233]: =>
> bdb_dn2id("macaddress=1\2C6\2C00:00:00:00:00:01,ou=xxx,ou=xx,o=xxxx")
> Apr 23 20:36:37 ldap01 slapd2.4[32233]: <= bdb_dn2id: get failed: DB_NOTFOUND:
> No matching key/data pair found (-30988)
> Apr 23 20:36:37 ldap01 slapd2.4[32233]: =>  bdb_dn2id_add 0x2fed7:
> "macaddress=1\2C6\2C00:00:00:00:00:01,ou=xxx,ou=xx,o=xxxx"
> Apr 23 20:37:44 ldap01 slapd2.4[32233]: slap_listener_activate(8):
> Apr 23 21:35:42 ldap01 slapd2.4[32233]: connection_close: conn=1025 sd=16
> Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1022 sd=17
> Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1037 sd=18
> Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1038 sd=20
> Apr 23 21:55:42 ldap01 slapd2.4[32233]: connection_close: conn=1039 sd=25
>
> Apr 24 12:02:23 ldap01 slapd2.4[32233]: daemon: shutdown requested and
> initiated. I have two Centos 5 servers running Openldap 2.4.22 with
> master/master syncrepl setup,
> Apr 24 12:02:23 ldap01 slapd2.4[32233]: connection_close: conn=1005 sd=21
> Apr 24 12:02:23 ldap01 slapd2.4[32233]: slapd shutdown: waiting for 59
> operations/tasks to finish
> Apr 24 12:02:34 ldap01 slapd2.4[26503]: bdb_db_open: database "": unclean
> shutdown detected; attempting recovery.
> Apr 24 12:02:35 ldap01 slapd2.4[26503]: slapd starting

Your config is a mess, with global directives and DB-specific directives 
interleaved. That may not be the cause of any specific problems, but it shows 
sloppiness on the part of the sysadmins.

> ---------------------- Relevant Config
> ------------------------------------------------
> moduleload     syncprov.la
> TLSCertificateFile      /etc/pki/tls/private/ldap.pem
> TLSCertificateKeyFile   /etc/pki/tls/private/ldap.pem
> TLSCACertificateFile    /etc/pki/tls/private/ldap.pem
>
> serverID        001
> database        bdb
> sizelimit              50000
> cachesize 10000
> checkpoint 256 5
>
>
> syncrepl        rid=001
>                  provider=ldap://10.10.10.10:389
>                  bindmethod=simple
>                  binddn="uid=xxxxrep,ou=xxxxxx,o=xxxxxx"
>                  credentials=MyPassword
>                  searchbase="o=xxxx"
>                  schemachecking=on
>                  sizelimit="unlimited"
>                  timelimit="unlimited"
>                  type=refreshAndPersist
>                  interval=00:00:00:10
>                  retry="5 5 60 +"
>                  attrs="*,+"
>
> mirrormode on
> overlay syncprov
> syncprov-checkpoint     100     10
> syncprov-sessionlog     100
>
> idletimeout 3600
> threads 32
>
>
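
To make the point about interleaved directives concrete, here is a small hypothetical checker in Python. It flags directives that slapd.conf(5) lists among the global options when they appear after the first `database` line. The set `GLOBAL_ONLY` is an illustrative, deliberately incomplete assumption, and `sizelimit` is left out because it is valid in both scopes.

```python
GLOBAL_ONLY = {  # illustrative subset, lower-cased for matching
    "moduleload", "serverid", "idletimeout", "threads", "loglevel",
    "tlscertificatefile", "tlscertificatekeyfile", "tlscacertificatefile",
}


def misplaced_globals(config_text):
    """Return (line_number, directive) for global directives found after 'database'."""
    findings = []
    in_database = False
    for number, raw in enumerate(config_text.splitlines(), start=1):
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # blank line or comment
        if raw[0].isspace():
            continue  # continuation of the previous directive
        directive = raw.split()[0].lower()
        if directive == "database":
            in_database = True
        elif in_database and directive in GLOBAL_ONLY:
            findings.append((number, directive))
    return findings


if __name__ == "__main__":
    sample = """\
moduleload     syncprov.la
serverID        001
database        bdb
cachesize 10000
syncrepl        rid=001
                provider=ldap://10.10.10.10:389
mirrormode on
overlay syncprov
idletimeout 3600
threads 32
"""
    for number, directive in misplaced_globals(sample):
        print(f"line {number}: global directive '{directive}' inside a database section")
```

On the abbreviated sample, the checker reports `idletimeout` and `threads`, the two global directives placed after the `database bdb` line in the quoted config.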
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/



Followup 2

Date: Sun, 1 May 2011 22:45:40 +0000
Subject: Re: (ITS#6919) "Weekly" crashes of OpenLDAP 2.4.22 using syncrepl master/master
From: Jarl Stefansson <jarl@dallur.com>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@openldap.org

First of all, let me apologize for the state of the config.

As you say, the logs don't show any crash, yet the OpenLDAP server is not
accepting any new connections after 20:36:37. There should be hundreds of
connections being logged there, but instead the server hangs completely and
nothing is logged, even at loglevel 1.

Perhaps someone could suggest a way of getting more information than what
loglevel 1 shows?

Would the only other step be a debug build?

Jarl
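
On the question of getting more than loglevel 1 shows: slapd.conf(5) defines `loglevel` as a bit mask, in which 1 traces function calls, 8 covers connection management, 256 covers connection, operation, and result statistics, and 16384 covers syncrepl consumer processing. The sketch below, in Python purely for illustration, combines named levels into the integer that would follow `loglevel`. The helper name `loglevel_mask` is an assumption, and which levels would actually help with this hang is a guess rather than a diagnosis.

```python
LEVELS = {
    "trace": 1,      # function calls
    "conns": 8,      # connection management
    "stats": 256,    # connections, operations, results
    "sync": 16384,   # syncrepl consumer processing
}


def loglevel_mask(*names):
    """OR together the named slapd log levels into a single integer."""
    mask = 0
    for name in names:
        mask |= LEVELS[name]
    return mask


if __name__ == "__main__":
    print("loglevel", loglevel_mask("conns", "stats", "sync"))
```

slapd.conf(5) also accepts the keywords themselves, so `loglevel conns stats sync` should be equivalent to the numeric form.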




-- 
Regards

Jarl
jarl@dallur.com


Message of length 13080 truncated


Followup 3

Date: Tue, 3 May 2011 10:07:18 +0000
Subject: Re: (ITS#6919) "Weekly" crashes of OpenLDAP 2.4.22 using syncrepl master/master
From: Jarl Stefansson <jarl@dallur.com>
To: hyc@symas.com
Cc: openldap-its@openldap.org
Sorry for the html mail, here is the mail again in plain text:

First of all, let me apologize for the state of the config.

As you say, the logs don't show any crash, yet the OpenLDAP server is
not accepting any new connections after 20:36:37. There should be
hundreds of connections being logged there, but instead the server
hangs completely and nothing is logged, even at loglevel 1.

Perhaps someone could suggest a way of getting more information than
what loglevel 1 shows?

Would the only other step be a debug build?

Jarl
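
A common way to inspect a process that hangs without logging anything is to attach a debugger and dump the stack of every thread. This does not strictly require a debug build, although the output is far more readable with debugging symbols installed. The Python sketch below only builds the gdb command line. `gdb -p`, `-batch`, and `-ex "thread apply all bt"` are standard gdb usage, while the helper name `backtrace_command` and the example PID (taken from the log in this report) are illustrative assumptions.

```python
import shlex


def backtrace_command(pid):
    """Build a gdb invocation that prints a backtrace for every thread of pid."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


if __name__ == "__main__":
    # 32233 is the slapd PID seen in the log excerpt; substitute the live one.
    print(shlex.join(backtrace_command(32233)))
```

Running the resulting command with sufficient privileges while slapd is hung, and saving the output, would show where each thread is blocked.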


Message of length 5138 truncated

