
Re: (ITS#3939) min/max function extension to LDAP protocol




--On Saturday, August 20, 2005 8:33 AM -0700 "Kurt D. Zeilenga" 
<Kurt@OpenLDAP.org> wrote:

>> No, it is the provider.
>
> The provider doesn't need a protocol extension to talk to
> itself.  As I noted, an ordering index will do.
>
>> The provider is trying to determine if the cookie
>> sent to it is still valid or if changes need to be sent to the consumer.
>> When the consumer's cookie is different from the provider's, it does an
>> internal search to determine if that cookie is still valid for its
>> database.  This is extremely expensive.
>>
>>
>> For example, if the provider has a cookie of:
>>
>> 20050818185717Z#000001#00#000000
>>
>> and the consumer has a cookie of:
>>
>> 20050818185716Z#000001#00#000000
>>
>> The provider is going to run the following search:
>>
>> entrycsn <= 20050818185716Z#000001#00#000000
>>
>> to determine if there is one or more entries with a matching CSN.  If
>> there isn't, the provider refreshes the consumer with all data.  If
>> there is, it then only does an update of changes from the time the
>> consumer last received data.
>
> The provider will likely be forced to use updates+present mode
> here as the fact that it found matching entries says nothing
> about how many deleted entries it didn't find.

But this search isn't done to find which entries need to be sent as 
updates.  It is done only to determine whether the cookie sent by the 
replica is still valid (i.e., there is at least one entry in the database 
that hasn't been updated since the replica last talked to the provider).

Determining what updates need to be sent to the replica is done later.  I 
already use the syncprov-sessionlog setting of the syncprov overlay on the 
provider side so that it has a log of changes (including deletes) to send 
to the replica *after* it has determined, based on the validity of the 
replica's cookie, whether the replica needs a full sync.
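
A minimal sketch of that two-step flow, in Python purely for illustration. 
It is not slapd's code: LoggedChange, plan_sync and the list standing in 
for the session log are hypothetical names, and it assumes CSNs in the 
fixed-width form shown in this thread, so plain string comparison orders 
them correctly.

```python
from typing import List, NamedTuple, Tuple


class LoggedChange(NamedTuple):
    """One hypothetical session-log record (not slapd's real layout)."""
    csn: str  # CSN assigned to the change
    op: str   # "add", "modify" or "delete"
    dn: str   # entry the change applies to


def plan_sync(cookie_valid: bool,
              consumer_cookie: str,
              session_log: List[LoggedChange]) -> Tuple[str, List[LoggedChange]]:
    """Decide what to send once the cookie check has already been done."""
    if not cookie_valid:
        # Cookie rejected: the replica gets a full refresh, the log is not used.
        return ("full", [])
    # Cookie accepted: replay only changes newer than the replica's cookie,
    # deletes included, in CSN order.
    pending = [c for c in session_log if c.csn > consumer_cookie]
    return ("partial", sorted(pending, key=lambda c: c.csn))
```

The point of the split is that cookie_valid is computed first and 
separately; the log is consulted only afterwards.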


>> The problem is the <= search done by the provider.  That needs to be
>> avoided.  If it was simply able to look at the lowest value CSN in its
>> database, it would immediately be able to tell whether or not the
>> consumer needs only a partial update, or a full update, and skip the
>> entire <= search that can generate an enormous number of results
>> altogether.
>
> This seems a bit of an oversimplification.  CSNs of existing
> entries say nothing about CSNs of deleted entries.  Those,
> as well as entries that have moved through scope of the
> search, are the nasty ones.   (I note that subtree rename
> introduces other complications.)

As noted above, this ITS isn't about what needs to be updated.  It is only 
about the expensive search the provider runs to validate a replica's 
cookie.  That search happens before the provider determines what updates 
need to be sent to the replica.
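
To make the comparison concrete, here is a rough Python sketch of the 
check in question next to the lowest-CSN shortcut.  Both function names 
are hypothetical, the list of strings stands in for the entryCSN values in 
the database, and CSNs are assumed to be in the fixed-width form used 
earlier in this thread so that string comparison orders them.

```python
from typing import List


def cookie_valid_by_search(entry_csns: List[str], consumer_cookie: str) -> bool:
    """Models the filter entrycsn <= cookie: valid if any entry matches.

    This shows only the result of the filter, not its cost on a real server.
    """
    return any(csn <= consumer_cookie for csn in entry_csns)


def cookie_valid_by_min(entry_csns: List[str], consumer_cookie: str) -> bool:
    """Same answer from a single value: the lowest entryCSN in the database."""
    lowest = min(entry_csns, default=None)
    # An empty database has no entry that could match the filter.
    return lowest is not None and lowest <= consumer_cookie
```

The two functions always agree, because some entry satisfies 
entrycsn <= cookie exactly when the lowest entryCSN does.  How a server 
would obtain that lowest value cheaply is outside what this sketch shows.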

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html

"These censorship operations against schools and libraries are stronger
than ever in the present religio-political climate. They often focus on
fantasy and sf books, which foster that deadly enemy to bigotry and blind
faith, the imagination." -- Ursula K. Le Guin