
Re: Fwd: I-D ACTION:draft-rharrison-bulkldif-00.txt



Roger Harrison wrote:

> Comment:
> I'm surprised that the motivation was to avoid 'the overhead of responses for each operation.'  Is this really a large overhead?
> 
> Response:
> The overhead I refer to is not network overhead but is due to the fact that the client, if making requests synchronously, must wait for the response to each operation before requesting the next one. (I believe that this is the way ldapmodify currently works, and it is slow.)  If the client is making requests asynchronously, there is no way to ensure that the operations will be received and applied to the DIT in the same order that they occurred in the LDIF file. I discuss this at more length below.

I see, you're worried about a lack of pipelining. 
But a slightly less than totally dumb client can
support pipelining of LDAP operations. Ordering
of operations then becomes an issue. The client
has to notice ordering sensitivity in the data
and interlock the pipeline appropriately, OR
the operations need to be sent with an implied
ordering and the server then needs to observe
that ordering.
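
To make the first option concrete, here is a rough sketch (in Python) of how a client might interlock its pipeline. It is only an illustration under stated assumptions: the names dn_parts, related and plan_batches are made up for this example, DNs are split naively on commas (escaped commas are ignored), and two operations are treated as order-sensitive only when their target DNs are equal or one is an ancestor of the other. A real client would also have to consider things like the new DN of a rename.

```python
def dn_parts(dn):
    # Naive split: ignores escaped commas, which a real DN parser must handle.
    return [part.strip().lower() for part in dn.split(",")]


def related(dn_a, dn_b):
    """True if the two DNs are equal or one is an ancestor of the other."""
    a, b = dn_parts(dn_a), dn_parts(dn_b)
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    # An ancestor's components form the tail of its descendant's components.
    return long_[len(long_) - len(short):] == short


def plan_batches(records):
    """Group (operation, dn) pairs, in file order, into batches.

    Requests inside one batch may be pipelined without waiting; the client
    waits for all responses of a batch before sending the next batch.
    """
    batches, current = [], []
    for op, dn in records:
        # Interlock: flush the batch if this record touches a related entry.
        if any(related(dn, other_dn) for _, other_dn in current):
            batches.append(current)
            current = []
        current.append((op, dn))
    if current:
        batches.append(current)
    return batches
```

With this scheme, adding a container and then its children costs one wait after the container, while unrelated sibling entries go out back to back.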

> 
> Comment:
> Sending LDIF on the wire seems messy. Why do this rather than send the entries in the same wire format they'd have in an LDAP add operation?
> 
> Response:
> Sending LDIF on the wire pushed most of the work of implementing this extension onto the server.  This is in keeping with RFC 2251 section 3.1 which states, "it is an objective of this protocol to minimize the complexity of clients." Based on this draft, the client doesn't have to have any knowledge of the LDIF file format.  The client just reads data from a file--which it assumes is in LDIF format--and sends chunks to the server; the chunks do not even have to correspond with LDIF record boundaries.

Surely clients are already able to parse LDIF and form the appropriate
LDAP PDUs (e.g. ldapmodify can do this, and it's only a few tens of
lines of code).
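
As a rough indication of the size involved, here is a minimal Python sketch of the parsing half. It is not ldapmodify's code, and parse_ldif is a name invented for this example. It assumes records are separated by blank lines, unfolds continuation lines (those starting with a single space), skips comment lines, and decodes "attr:: value" as base64. It does not handle the "version:" line specially, nor "attr:< url" values, and it does not build the LDAP PDUs.

```python
import base64


def parse_ldif(text):
    """Split LDIF text into records, each a list of (attribute, bytes) pairs."""
    # Unfold: a line starting with one space continues the previous line.
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    records, current = [], []
    for line in lines + [""]:          # trailing "" flushes the last record
        if line.startswith("#"):       # comment line
            continue
        if not line.strip():           # blank line ends the current record
            if current:
                records.append(current)
                current = []
            continue
        attr, _, rest = line.partition(":")
        if rest.startswith(":"):       # "attr:: value" is base64-encoded
            value = base64.b64decode(rest[1:].strip())
        else:
            value = rest.strip().encode("utf-8")
        current.append((attr, value))
    return records
```

Each returned record keeps its lines in file order, so the "dn" and "changetype" pairs can be read off the front to decide which LDAP operation to build.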
 
> Sending LDIF across the wire has other benefits.  First, it maintains the ordering of operations coming from the LDIF file. The latest LDIF draft states, "An LDIF file consists of a series of records separated by line separators."  The term "series" implies that the records are sequenced in order from the beginning to the end of the file. While it is possible to have the client send many asynchronous LDAP requests, one for each record in the LDIF file, there is no way to enforce the ordering of those asynchronous LDAP requests.

Interesting. If ordering is the real problem, I'd probably try to solve
just that issue with multiple LDAP add operations (e.g. by introducing
some ordering control).
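
For illustration only, the server side of such an ordering control could be little more than a reorder buffer. The Python sketch below assumes each request carries a sequence number in a hypothetical control, that numbering starts at 1 and has no gaps, and that a single connection feeds the buffer. No such control is defined anywhere; ReorderBuffer and submit are invented names.

```python
class ReorderBuffer:
    """Hold out-of-order requests until all earlier ones have arrived."""

    def __init__(self, first_seq=1):
        self.next_seq = first_seq   # sequence number to apply next
        self.pending = {}           # seq -> request that arrived early

    def submit(self, seq, request):
        """Accept one request; return those now ready, in sequence order."""
        if seq < self.next_seq or seq in self.pending:
            raise ValueError(f"duplicate or stale sequence number {seq}")
        self.pending[seq] = request
        ready = []
        # Drain every request whose predecessors have all been released.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

The server would apply whatever submit() returns and reply to each request as usual. A request that arrives early simply waits in the buffer, so a real server would also need a bound on how much it is willing to hold.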

> Comment:
> How does this relate to the LDUP work?  It seems to me that the task of initializing a replica server over-the-wire is quite close to the intended purpose of this proposal.
> 
> Response:
> This extension is not intended specifically for the task of initializing a replica server over the wire.  It is intended to facilitate the remote application of directory data to a DIT in a widely available format. While administrators may frequently use this extension to initialize a server's data, they may use it even more frequently to import data for a batch of new users or new customers or to apply other batches of changes.  The number of records for this type of scenario might range into the tens or hundreds of thousands per use based on what our customers tell us they'll do.
> 
> There may be some overlap in functionality between LDUP and this extension, but this extension is not intended to do replication.

The point of my comment was that if a server already implemented
the LDUP protocol, then wouldn't it make sense to use and perhaps
extend that for the task at hand, rather than defining yet another
similar but different mechanism?

> 1. The client would parse the LDIF file into individual records and submit each record as an asynchronous LDAP request.
> 
> 2. To preserve ordering, the client would add a control to each request identifying the record number.  This will allow the server to delay processing a re-ordered request until the records that came before it in the LDIF file are processed.
> 
> 3. The server would simply reply to each request as it would any other LDAP request; the client would be completely responsible for dealing appropriately with error conditions.
> 
> Advantages of this approach are that the server wouldn't have to provide any new functionality except for support for the control.  Existing LDAP protocol operations would be used to transmit the requests, and the data would be sent in BER-encoded form.  The client would receive notifications for every completed request which would make it easy to keep track of the completion status.
> 
> Disadvantages of this approach are that the client would have to include support for full parsing of LDIF files to encode the LDAP requests.  The client would be completely responsible for dealing with critical errors and possibly backing out changes made during the abandon phase of the transmission after receiving a critical error. It would also require server support for a new control.

I don't see the LDIF parsing issue as a real problem. Clients can
already do this.
The transacted nature of the operation could be specified with yet more
controls on the operations. I suspect that servers would find it hard
to implement this generically; more likely, these groups of operations
would only be permitted on predefined DIT partitions.