Full_Name: Jonathan Clarke
Version: RE24
OS: irrelevant
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (213.41.243.192)

Hi,

When adding a syncrepl config, the function add_syncrepl performs a "check if URL points to current server". This check is based on an exact match between the provider parameter from the syncrepl config line, and the URIs given to slapd on startup.

If this doesn't match when it should, the database is marked as a shadow, and all following updates fail with "shadow context; no update refs". This is quite a pain when it happens on cn=config :)

There are multiple cases when this happens:
1) If no specific URI was specified on launch (no -h option)
2) Port numbers are explicitly specified or not (":389")
3) Trailing slash (for example "ldap://1.2.3.4" != "ldap://1.2.3.4/")
4) IP is specified rather than DNS name ("ldap://localhost" != "ldap://127.0.0.1")

I saw the comment in the code that clarifies this behaviour. However, it's a surprising behaviour, and I think there is code to parse this kind of thing in the serverID detection now. Maybe it could be reused?

Otherwise, we should probably document this behaviour, to avoid headaches :)
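The mismatch cases listed in the report can be illustrated with a small comparison sketch. This is not the slapd code: the function names and the normalization rules (lowercasing, default ports 389/636, ignoring the trailing slash) are assumptions made purely for illustration, and no DNS lookup is done, so case 4 still fails to match here.

```python
from urllib.parse import urlsplit

# Assumed default ports for the two common LDAP URI schemes.
DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def normalize(uri):
    """Reduce an LDAP URI to (scheme, host, port); hypothetical helper."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def exact_match(provider, listeners):
    """Plain string comparison, as described in the report."""
    return provider in listeners


def normalized_match(provider, listeners):
    """Comparison on parsed URIs, tolerant of ports and trailing slashes."""
    target = normalize(provider)
    return any(normalize(listener) == target for listener in listeners)


listeners = ["ldap://localhost:389/"]

for provider in ("ldap://localhost:389/", "ldap://localhost",
                 "ldap://localhost/", "ldap://127.0.0.1"):
    print(provider, exact_match(provider, listeners),
          normalized_match(provider, listeners))
```

Only the first provider string matches exactly. The second and third match once the URIs are parsed. The last one would need name resolution, which this sketch deliberately leaves out.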
jclarke@linagora.com wrote:
> Full_Name: Jonathan Clarke
> Version: RE24
> OS: irrelevant
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (213.41.243.192)
>
> Hi,
>
> When adding a syncrepl config, the function add_syncrepl performs a "check if URL points to current server". This check is based on an exact match between the provider parameter from the syncrepl config line, and the URIs given to slapd on startup.
>
> If this doesn't match when it should, the database is marked as a shadow, and all following updates fail with "shadow context; no update refs". This is quite a pain when it happens on cn=config :)
>
> There are multiple cases when this happens:
> 1) If no specific URI was specified on launch (no -h option)
> 2) Port numbers are explicitly specified or not (":389")
> 3) Trailing slash (for example "ldap://1.2.3.4" != "ldap://1.2.3.4/")
> 4) IP is specified rather than DNS name ("ldap://localhost" != "ldap://127.0.0.1")
>
> I saw the comment in the code that clarifies this behaviour. However, it's a surprising behaviour, and I think there is code to parse this kind of thing in the serverID detection now. Maybe it could be reused?
>
> Otherwise, we should probably document this behaviour, to avoid headaches :)

The manpage says the serverID URL must use an FQDN. We already do a number of guesses in the code, I don't see any reason to extend this further.

1) with no URI the listener will default to localhost. The serverID URL should therefore refer to localhost, or omit the hostname.
2) Port numbers shouldn't be an issue, since they're always matched in the parsed URLs.
3) Trailing slashes don't matter in the parsed URLs.
4) The doc is quite explicit about using FQDN. I have no sympathy for people who trip over this.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
hyc@symas.com wrote:
> jclarke@linagora.com wrote:
>> Full_Name: Jonathan Clarke
>> Version: RE24
>> OS: irrelevant
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (213.41.243.192)
>>
>> Hi,
>>
>> When adding a syncrepl config, the function add_syncrepl performs a "check if URL points to current server". This check is based on an exact match between the provider parameter from the syncrepl config line, and the URIs given to slapd on startup.
>>
>> If this doesn't match when it should, the database is marked as a shadow, and all following updates fail with "shadow context; no update refs". This is quite a pain when it happens on cn=config :)
>>
>> There are multiple cases when this happens:
>> 1) If no specific URI was specified on launch (no -h option)
>> 2) Port numbers are explicitly specified or not (":389")
>> 3) Trailing slash (for example "ldap://1.2.3.4" != "ldap://1.2.3.4/")
>> 4) IP is specified rather than DNS name ("ldap://localhost" != "ldap://127.0.0.1")
>>
>> I saw the comment in the code that clarifies this behaviour. However, it's a surprising behaviour, and I think there is code to parse this kind of thing in the serverID detection now. Maybe it could be reused?

Oops, never mind. I was thinking of the serverID code, and you're talking about the syncrepl code.

I'll have to think a bit about why the two aren't behaving identically; probably they should both use the same code.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
Howard Chu wrote:
> hyc@symas.com wrote:
>> jclarke@linagora.com wrote:
>>> I saw the comment in the code that clarifies this behaviour. However, it's a surprising behaviour, and I think there is code to parse this kind of thing in the serverID detection now. Maybe it could be reused?
>
> Oops, never mind. I was thinking of the serverID code, and you're talking about the syncrepl code.
>
> I'll have to think a bit about why the two aren't behaving identically; probably they should both use the same code.

Ah right. For the serverID code, we're doing a liberal match because anything that matches the current name:port is obviously the current server, and we want it to be identified as such.

For syncrepl, we know full well that it may be the current server, but for whatever reason you may want to replicate against yourself anyway (e.g., against a different baseDN, etc...).

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
On Thu, 12 February 2009 20:41, Howard Chu wrote:
> jclarke@linagora.com wrote:
>> Full_Name: Jonathan Clarke
>> Version: RE24
>> OS: irrelevant
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (213.41.243.192)
>>
>> Hi,
>>
>> When adding a syncrepl config, the function add_syncrepl performs a "check if URL points to current server". This check is based on an exact match between the provider parameter from the syncrepl config line, and the URIs given to slapd on startup.
>>
>> If this doesn't match when it should, the database is marked as a shadow, and all following updates fail with "shadow context; no update refs". This is quite a pain when it happens on cn=config :)
>>
>> There are multiple cases when this happens:
>> 1) If no specific URI was specified on launch (no -h option)
>> 2) Port numbers are explicitly specified or not (":389")
>> 3) Trailing slash (for example "ldap://1.2.3.4" != "ldap://1.2.3.4/")
>> 4) IP is specified rather than DNS name ("ldap://localhost" != "ldap://127.0.0.1")
>>
>> I saw the comment in the code that clarifies this behaviour. However, it's a surprising behaviour, and I think there is code to parse this kind of thing in the serverID detection now. Maybe it could be reused?
>>
>> Otherwise, we should probably document this behaviour, to avoid headaches :)
>
> The manpage says the serverID URL must use an FQDN. We already do a number of guesses in the code, I don't see any reason to extend this further.

I'm sorry, but I was not referring to the serverID URI matching. I'm referring to the matching of the syncrepl provider against the listeners. I mentioned the serverID matching since it seems to work the way syncrepl provider matching should.

Regards,
Jonathan
hyc@symas.com wrote:
> Howard Chu wrote:
>> hyc@symas.com wrote:
>>> jclarke@linagora.com wrote:
>>>> I saw the comment in the code that clarifies this behaviour. However, it's a surprising behaviour, and I think there is code to parse this kind of thing in the serverID detection now. Maybe it could be reused?
>>
>> Oops, never mind. I was thinking of the serverID code, and you're talking about the syncrepl code.
>>
>> I'll have to think a bit about why the two aren't behaving identically; probably they should both use the same code.
>
> Ah right. For the serverID code, we're doing a liberal match because anything that matches the current name:port is obviously the current server, and we want it to be identified as such.
>
> For syncrepl, we know full well that it may be the current server, but for whatever reason you may want to replicate against yourself anyway (e.g., against a different baseDN, etc...).

Actually, this comes straight from test049 (config replication). With multimaster config replication, it starts by replicating itself (useless, but necessary for other masters). This problem doesn't appear in the test case, since a variable ($URI1) is used for both the slapd -h $URI1 option, and in the syncrepl provider=$URI1 LDIF. Otherwise it would produce weird results.

Jon
Jonathan Clarke wrote:
> hyc@symas.com wrote:
>> Ah right. For the serverID code, we're doing a liberal match because anything that matches the current name:port is obviously the current server, and we want it to be identified as such.
>>
>> For syncrepl, we know full well that it may be the current server, but for whatever reason you may want to replicate against yourself anyway (e.g., against a different baseDN, etc...).
>
> Actually, this comes straight from test049 (config replication). With multimaster config replication, it starts by replicating itself (useless, but necessary for other masters). This problem doesn't appear in the test case, since a variable ($URI1) is used for both the slapd -h $URI1 option, and in the syncrepl provider=$URI1 LDIF. Otherwise it would produce weird results.

If you want a setup similar to test049, follow what it does. If you want to do something else, do otherwise.

There are valid reasons to point a syncrepl consumer at the same slapd (e.g., proxy syncrepl, automatic local backup/mirroring, rewriting a subtree of a main database, etc.). Using the exhaustive matching that serverID uses would preclude those cases from working.

As usual, when you're configuring a server, you have to pay attention to what you're doing and what effect you want to accomplish.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
On 13.02.2009 01:23, Howard Chu wrote:
> Jonathan Clarke wrote:
>> hyc@symas.com wrote:
>>> Ah right. For the serverID code, we're doing a liberal match because anything that matches the current name:port is obviously the current server, and we want it to be identified as such.
>>>
>>> For syncrepl, we know full well that it may be the current server, but for whatever reason you may want to replicate against yourself anyway (e.g., against a different baseDN, etc...).
>>
>> Actually, this comes straight from test049 (config replication). With multimaster config replication, it starts by replicating itself (useless, but necessary for other masters). This problem doesn't appear in the test case, since a variable ($URI1) is used for both the slapd -h $URI1 option, and in the syncrepl provider=$URI1 LDIF. Otherwise it would produce weird results.
>
> If you want a setup similar to test049, follow what it does. If you want to do something else, do otherwise.

I am following what test049 does. The only difference is that test049 starts slapd with "-h $URI1" (with URI1="ldap://localhost:9011/") and I start slapd with no -h option or a slightly different URI. This is quite common, I believe.

If you patch the test as follows, it fails:

8<----------------------------------------------------
--- tests/scripts/test049-sync-config 10 Feb 2009 12:29:01 -0000 1.4.2.9
+++ tests/scripts/test049-sync-config 13 Feb 2009 10:11:22 -0000
@@ -112,7 +112,7 @@ $LDAPMODIFY -D cn=config -H $URI1 -y $CO
-olcSyncRepl: rid=001 provider=$URI1 binddn="cn=config" bindmethod=simple
+olcSyncRepl: rid=001 provider=`echo $URI1 | sed "s/localhost/127.0.0.1/"` binddn="cn=config" bindmethod=simple
8<----------------------------------------------------

Error:

8<----------------------------------------------------
Running ./scripts/test049-sync-config...
running defines.sh
Starting producer slapd on TCP/IP port 9011...
Using ldapsearch to check that producer slapd is running...
Inserting syncprov overlay on producer...
ldapmodify failed for syncrepl config (10)!
8<----------------------------------------------------

> There are valid reasons to point a syncrepl consumer at the same slapd (e.g., proxy syncrepl, automatic local backup/mirroring, rewriting a subtree of a main database, etc.). Using the exhaustive matching that serverID uses would preclude those cases from working.

I see. I understand you don't want to break existing functionality. However, I still feel this is a bug, or at least that you need to "hack" the setup to get things working as expected.

Would I be right in assuming that in all the "valid reasons to point a syncrepl consumer at the same slapd" you mention, the syncrepl consumer uses a different base DN than the base DN of the database that the syncrepl consumer is configured on? If so, maybe the exhaustive matching that serverID uses could be used, if and only if those two base DNs are different?

Jonathan
On 13.02.2009 14:29, jclarke@linagora.com wrote:
> On 13.02.2009 01:23, Howard Chu wrote:
>> Jonathan Clarke wrote:
>>> hyc@symas.com wrote:
>>>> Ah right. For the serverID code, we're doing a liberal match because anything that matches the current name:port is obviously the current server, and we want it to be identified as such.
>>>>
>>>> For syncrepl, we know full well that it may be the current server, but for whatever reason you may want to replicate against yourself anyway (e.g., against a different baseDN, etc...).
>>>
>>> Actually, this comes straight from test049 (config replication). With multimaster config replication, it starts by replicating itself (useless, but necessary for other masters). This problem doesn't appear in the test case, since a variable ($URI1) is used for both the slapd -h $URI1 option, and in the syncrepl provider=$URI1 LDIF. Otherwise it would produce weird results.
>>
>> If you want a setup similar to test049, follow what it does. If you want to do something else, do otherwise.
>
> I am following what test049 does.

Oops, sorry, I've been mixing 049 and 050 in my head. My bad, this isn't a problem for test049 indeed.

Although I am curious as to why test049 creates a syncrepl consumer targeted at itself? I can't quite get my head around it yet with regard to multimaster, but will keep thinking.
Jonathan Clarke wrote:
> On 13.02.2009 01:23, Howard Chu wrote:
>> If you want a setup similar to test049, follow what it does. If you want to do something else, do otherwise.
>
> I am following what test049 does. Only difference is test049 starts slapd with "-h $URI1" (with URI1="ldap://localhost:9011/") and I start slapd with no -h option or a slightly different URI. This is quite common, I believe.

And the slapd(8) manpage clearly states that with no -h option, it defaults to "ldap:///". If the default suits your purpose, use it. If not, then specify the URL you want explicitly. Yes, it's quite common to start slapd with no -h option, because it suits the general case. This discussion is not about the general case.

> If you patch the test as follows, it fails:

Obviously you should not do that.

>> There are valid reasons to point a syncrepl consumer at the same slapd (e.g., proxy syncrepl, automatic local backup/mirroring, rewriting a subtree of a main database, etc.). Using the exhaustive matching that serverID uses would preclude those cases from working.
>
> I see. I understand you don't want to break existing functionality. However, I still feel this is a bug, or at least that you need to "hack" the setup to get things working as expected.

The docs state that "syncrepl provider" is the LDAP URI of the master server. If you're not putting in the same URI as the master is using, that's a config error, not a bug.

> Would I be right in assuming that in all the "valid reasons to point a syncrepl consumer at the same slapd" you mention, the syncrepl consumer uses a different base DN than the base DN of the database that the syncrepl consumer is configured on? If so, maybe the exhaustive matching that serverID uses could be used, if and only if those two base DNs are different?

That sounds reasonable. Will think about it a bit more. Certainly if the two base DNs are the same, it will cause a loop...

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
jclarke@linagora.com wrote:
> This is a multi-part message in MIME format.

Please do not send HTML/MIME emails here.

> On 13.02.2009 14:29, jclarke@linagora.com wrote:
> I am following what test049 does.
> Oops, sorry, I've been mixing 049 and 050 in my head. My bad, this isn't a problem for test049 indeed.
>
> Although I am curious as to why test049 creates a syncrepl consumer targeted at itself? Can't quite get my head around it yet, but will keep thinking, with regard to multimaster.

Very simple. Once the real consumer connects to the master and replicates, it will acquire whatever config was on the master. If the master doesn't have a consumer pointed at itself, then once the consumer acquires the master's configuration, the consumer's syncrepl config will disappear, and replication will stop.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
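The explanation above can be modelled with a toy simulation. This is only a sketch: a dictionary stands in for cn=config, olcSyncRepl is the attribute name used in the test049 patch quoted earlier, and the helper names and the "replication copies the whole config" shortcut are assumptions for illustration.

```python
import copy

MASTER_URI = "ldap://localhost:9011/"  # value of $URI1 quoted in the thread


def replicate_config(master_config):
    """Toy model: after replication the consumer's config mirrors the master's."""
    return copy.deepcopy(master_config)


def still_replicating(config):
    """A server keeps replicating only while it has a syncrepl stanza."""
    return len(config.get("olcSyncRepl", [])) > 0


# Master with no consumer pointed at itself.
master_plain = {"olcSyncRepl": []}
# Master with a (locally useless) consumer pointed at itself, as in test049.
master_self = {"olcSyncRepl": ["rid=001 provider=" + MASTER_URI]}

# The consumer starts out with its own syncrepl stanza in both scenarios.
consumer_before = {"olcSyncRepl": ["rid=001 provider=" + MASTER_URI]}

consumer_after_plain = replicate_config(master_plain)
consumer_after_self = replicate_config(master_self)

print(still_replicating(consumer_before))       # consumer as initially configured
print(still_replicating(consumer_after_plain))  # stanza lost, replication stops
print(still_replicating(consumer_after_self))   # stanza preserved
```

In this model the consumer keeps its syncrepl stanza only when the master's own config carries one, which is why the master needs an entry naming itself.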
changed state Open to Feedback
On 14.02.2009 01:51, Howard Chu wrote:
> Jonathan Clarke wrote:
>> On 13.02.2009 01:23, Howard Chu wrote:
>>> If you want a setup similar to test049, follow what it does. If you want to do something else, do otherwise.
>>
>> I am following what test049 does. Only difference is test049 starts slapd with "-h $URI1" (with URI1="ldap://localhost:9011/") and I start slapd with no -h option or a slightly different URI. This is quite common, I believe.
>
> And the slapd(8) manpage clearly states that with no -h option, it defaults to "ldap:///". If the default suits your purpose, use it. If not, then specify the URL you want explicitly. Yes, it's quite common to start slapd with no -h option, because it suits the general case. This discussion is not about the general case.
>
>> If you patch the test as follows, it fails:
>
> Obviously you should not do that.
>
>>> There are valid reasons to point a syncrepl consumer at the same slapd (e.g., proxy syncrepl, automatic local backup/mirroring, rewriting a subtree of a main database, etc.). Using the exhaustive matching that serverID uses would preclude those cases from working.
>>
>> I see. I understand you don't want to break existing functionality. However, I still feel this is a bug, or at least that you need to "hack" the setup to get things working as expected.
>
> The docs state that "syncrepl provider" is the LDAP URI of the master server. If you're not putting in the same URI as the master is using, that's a config error, not a bug.

Understood. Nothing is broken, and we have worked around this issue by passing explicit listeners to slapd -h.

This behaviour did surprise me, considering the behaviour of serverID matching. May I suggest that a note of warning be included in the admin guide? For example: http://milopita.phillipoux.net/jonathan-clarke-20090216.patch

>> Would I be right in assuming that in all the "valid reasons to point a syncrepl consumer at the same slapd" you mention, the syncrepl consumer uses a different base DN than the base DN of the database that the syncrepl consumer is configured on? If so, maybe the exhaustive matching that serverID uses could be used, if and only if those two base DNs are different?
>
> That sounds reasonable. Will think about it a bit more. Certainly if the two base DNs are the same, it will cause a loop...

Indeed. For what it's worth, while testing on cn=config, I noticed this having various consequences including:
1) cn=config becoming read-only (shadow) and updates returning a referral.
2) in an MMR setup, some updates not being replicated to all N servers.

Regards,
Jonathan
changed notes
jclarke@linagora.com wrote:
> On 14.02.2009 01:51, Howard Chu wrote:
>> Jonathan Clarke wrote:
>>> On 13.02.2009 01:23, Howard Chu wrote:
>>>> There are valid reasons to point a syncrepl consumer at the same slapd (e.g., proxy syncrepl, automatic local backup/mirroring, rewriting a subtree of a main database, etc.). Using the exhaustive matching that serverID uses would preclude those cases from working.
>>>
>>> I see. I understand you don't want to break existing functionality. However, I still feel this is a bug, or at least that you need to "hack" the setup to get things working as expected.
>>
>> The docs state that "syncrepl provider" is the LDAP URI of the master server. If you're not putting in the same URI as the master is using, that's a config error, not a bug.
>
> Understood. Nothing is broken, and we have worked around this issue using explicit listeners to slapd -h.
>
> This behaviour did surprise me, considering the behaviour of serverID matching. May I suggest that a note of warning is included in the admin guide? For example:
> http://milopita.phillipoux.net/jonathan-clarke-20090216.patch
>
>>> Would I be right in assuming that in all the "valid reasons to point a syncrepl consumer at the same slapd" you mention, the syncrepl consumer uses a different base DN than the base DN of the database that the syncrepl consumer is configured on? If so, maybe the exhaustive matching that serverID uses could be used, if and only if those two base DNs are different?

Actually, the exhaustive matching is only needed if the two base DNs are the same. This is now patched in HEAD, please test.

>> That sounds reasonable. Will think about it a bit more. Certainly if the two base DNs are the same, it will cause a loop...
>
> Indeed. For what it's worth, while testing on cn=config, I noticed this having various consequences including:
> 1) cn=config becoming read-only (shadow) and updates returning a referral.
> 2) in a MMR setup, some updates not being replicated to all N servers.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
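The rule stated in the reply above (exhaustive matching only when the two base DNs are the same) can be sketched as follows. This is not the patch committed to HEAD: the function names, the URI normalization, the simplified DN comparison and the decision to keep the exact comparison in all other cases are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumed default ports for the two common LDAP URI schemes.
DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def loose_key(uri):
    """Parsed (scheme, host, port) form of an LDAP URI; hypothetical helper."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    return (scheme, host, parts.port or DEFAULT_PORTS.get(scheme))


def same_dn(a, b):
    """Simplified DN comparison: case-insensitive, ignoring spaces around commas."""
    def norm(dn):
        return ",".join(part.strip().lower() for part in dn.split(","))
    return norm(a) == norm(b)


def provider_is_this_server(provider, listeners, search_base, db_suffix):
    """Decide whether a syncrepl provider URI refers to the local slapd."""
    # An exact string match always counts, as in the original behaviour.
    if provider in listeners:
        return True
    # Same base DN: replicating from ourselves would loop, so match liberally.
    if same_dn(search_base, db_suffix):
        target = loose_key(provider)
        return any(loose_key(listener) == target for listener in listeners)
    # Different base DN: a non-identical URI is treated as another provider.
    return False
```

Under these assumptions a cn=config consumer whose provider is written as "ldap://localhost:9011" is still recognised as local when slapd listens on "ldap://localhost:9011/", while a consumer with a different search base is left alone unless the URI matches exactly.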
changed notes
changed state Feedback to Test
moved from Incoming to Software Bugs

changed notes
changed state Test to Release

changed notes
changed state Release to Closed

doc added in HEAD
fixed in HEAD
fixed in RE24