Re: (ITS#9015) Replication goes haywire querying promoted master
- To: openldap-its@OpenLDAP.org
- Subject: Re: (ITS#9015) Replication goes haywire querying promoted master
- From: hyc@symas.com
- Date: Tue, 23 Apr 2019 17:07:54 +0000
- Auto-submitted: auto-generated (OpenLDAP-ITS)
quanah@openldap.org wrote:
> Full_Name: Quanah Gibson-Mount
> Version: 2.4.47
> OS: N/A
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (47.208.144.40)
>
>
> In testing a particular use case/setup scenario, I found that it's possible to
> cause a replica to slam a provider with unending requests. In this specific
> case, I was setting up delta-syncrepl MMR, but I believe the issue applies to
> standard syncrepl, and is not MMR specific. The scenario looks like this:
>
> Initially we have a standalone server, with no overlays in place. The
> configuration is done via cn=config, which allows for us to update the
> configuration without a server restart.
>
> The configuration is modified to load the syncprov and accesslog overlays,
> create a new accesslog database, and to send all change data to the accesslog
> db.
>
> After that is done, a secondary server is brought online with the same
> configuration other than the serverID being different and the syncrepl statement
> adjusted.
>
> When the secondary server is started, it pummels the initial provider with
> queries like:
>
> Apr 23 06:39:06 anvil4 slapd[28967]: conn=1003 op=361868131 SRCH
> base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
> Apr 23 06:39:06 anvil4 slapd[28967]: conn=1003 op=361868131 SRCH attr=* +
> Apr 23 06:39:06 anvil4 slapd[28967]: conn=1003 op=361868131 SEARCH RESULT
> tag=101 err=0 nentries=0 text=
>
> (Averaging around 2000 queries/second on my server per syncrepl client).
>
> I believe the problem is that the root entry for the database contains no
> contextCSN. This is likely due to the fact that:
>
> a) There was never a syncprov overlay present until I loaded this one in
> b) The serverID was set prior to the syncprov overlay being loaded (So it went
> from "0" to "1", with no changes ever recorded for "1").
>
> Now there is a trivial way to handle this: make a change on the provider
> prior to starting up the other servers.
>
> However, I think the overall behavior is undesirable. If there is no contextCSN
> present, it should not lead to replication clients executing a potential DoS on
> the provider. It also generated ~60GB of logs at loglevel stats in 1 day.
The consumer should not be reconnecting more frequently than its retry config.
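
One way to tell reconnects apart from repeated searches on an already open
connection is to tally SRCH lines per connection and per second from the
stats log. The following is a minimal sketch, assuming syslog lines in the
format quoted above with each SRCH record on a single unwrapped line; the
function name searches_per_second is hypothetical and not part of OpenLDAP.

```python
import re
import sys
from collections import Counter

# Matches the first line of a stats-level search record, for example:
#   Apr 23 06:39:06 anvil4 slapd[28967]: conn=1003 op=361868131 SRCH base="..."
# The follow-up "SRCH attr=..." line is deliberately not matched, so each
# search operation is counted once.
SRCH_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+slapd\[\d+\]:\s+"
    r"conn=(?P<conn>\d+)\s+op=\d+\s+SRCH\s+base="
)


def searches_per_second(lines):
    """Return a Counter keyed by (conn, timestamp) with the number of searches."""
    counts = Counter()
    for line in lines:
        match = SRCH_RE.match(line)
        if match:
            counts[(match.group("conn"), match.group("ts"))] += 1
    return counts


if __name__ == "__main__":
    # Usage (hypothetical log path): python tally.py < /var/log/slapd.log
    for (conn, ts), n in searches_per_second(sys.stdin).most_common(10):
        print(f"conn={conn} {ts} searches={n}")
```

If a single conn value accounts for most of the searches in each second, the
load is coming from operations repeated on one connection rather than from
new connections, which would be worth comparing against the retry setting.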
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/