
Re: Thoughts on simple paged search results



At 09:24 PM 2002-01-05, Debajyoti Bera wrote:
>Hi
>        I thought of working on implementing OpenLDAP paged search result. I am 
>jotting down my observations and thoughts below.
>        It's clear that such a facility can't be built into ldapsearch without
>support from the backend. So primarily some backend routines are to be written
>to support some sort of range search. Now OpenLDAP supports a large number of
>backends, but it seems that the major ones are back-ldbm and back-bdb.
>However, back-bdb is still in rapid stages of development (implementation of
>indices is currently being carried out), so back-ldbm may be the place to
>start with.
>        Now, the ldapsearch routine just calls the appropriate search function for
>the specified backend and passes it all the search parameters along with
>(connection *conn, operation *op). The cursors of the databases can be used to
>store the point from which search for the next set of results can be started.
>As this seems to be some sort of persistent feature for that connection, it is
>easier to store the cursors in 'conn'. Besides the cursor, the backend
>specific search depends on an array candidates[] of IDs. But the array is
>quite a large one, so currently we can store only a few such structures in
>conn (i.e. effectively limiting the number of simultaneous paged searches
>a client can do on one connection). So the changes go like this (I was
>changing them in the files but due to my upcoming exams, I have stopped
>temporarily):
>
>in struct slap_conn add field
>void *paged_search;

I have a few thoughts on this as well...

Probably should point to a struct which has common fields
and a pointer to backend-specific information.  Something
like
        struct slap_paged_results *paging_ctx;

(support only one context initially, it can be easily changed
later to support multiple contexts)

   struct slap_paged_results {
        Backend         *spr_be;
        unsigned long   spr_opid;
        HASHdigest      spr_digest;
        ID              spr_cursor;
        void            *spr_be_specific;
   };
The digest is a hash of all the search parameters which are
specific to the initial page search request plus the initial
(or "last") operation id.
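
As a rough sketch of how such a digest might be computed: the
struct paged_search_params type, the paged_digest() name, and the use
of 64-bit FNV-1a in place of a real HASHdigest are all assumptions
made for illustration, not existing slapd code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the parameters fixed by the first page request. */
struct paged_search_params {
    const char *base_dn;
    int         scope;
    const char *filter;   /* string form of the search filter */
};

/* 64-bit FNV-1a step; only a placeholder for a real digest function. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;   /* FNV prime */
    }
    return h;
}

/* Hash the search parameters together with the operation id. */
static uint64_t paged_digest(const struct paged_search_params *sp,
                             unsigned long opid)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */

    assert(sp != NULL && sp->base_dn != NULL && sp->filter != NULL);

    /* Include each terminating NUL so adjacent strings cannot run together. */
    h = fnv1a(h, sp->base_dn, strlen(sp->base_dn) + 1);
    h = fnv1a(h, &sp->scope, sizeof sp->scope);
    h = fnv1a(h, sp->filter, strlen(sp->filter) + 1);
    h = fnv1a(h, &opid, sizeof opid);
    return h;
}
```

A continuation request would recompute the digest from its own
parameters and the stored opid, and be rejected on a mismatch.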

The cookie can be either the digest or the opid or
something derived from both.  I'd likely use the opid
and avoid the separate spr_cookie field.   I might
actually use the "last" operation id.

The spr_be_specific pointer would point to a candidates list
and/or other information needed.  For now, I'd just ch_free()
as needed.  But a backend specific routine could be provided
and called if needed (hence spr_be).
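
A minimal sketch of that cleanup, assuming a cut-down Backend with a
hypothetical be_paged_free hook and using plain free() where slapd
would use ch_free():

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-ins for the slapd types. */
typedef struct backend {
    /* Optional hook (assumed name); NULL means a plain free() is enough. */
    void (*be_paged_free)(void *be_specific);
} Backend;

struct slap_paged_results {
    Backend       *spr_be;
    unsigned long  spr_opid;
    void          *spr_be_specific;
};

/* Release a paging context and clear the caller's pointer to it. */
static void paged_results_free(struct slap_paged_results **ctxp)
{
    struct slap_paged_results *ctx;

    assert(ctxp != NULL);
    ctx = *ctxp;
    if (ctx == NULL)
        return;   /* nothing to release */

    if (ctx->spr_be != NULL && ctx->spr_be->be_paged_free != NULL)
        ctx->spr_be->be_paged_free(ctx->spr_be_specific);   /* backend routine */
    else
        free(ctx->spr_be_specific);   /* default: just free it */

    free(ctx);
    *ctxp = NULL;
}
```

Clearing the pointer makes a second call harmless, which helps if
both the search path and connection teardown try to release it.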

The spr_be would also be used by the frontend, which would
parse the client control, for additional safety checks.

I'm sure there are a lot of other issues involved, such
as how to deal with spanning databases where not all support
the control.


>During initialization of backends, this has to be properly initialised.
>In back-ldbm (and similarly in back-bdb):
>
>struct ldbm_paged_search_t{
>        ID cursor;
>        ID_BLOCK *candidates;
>        //some other info
>        char *cookie;
>        int index; //index into the ldbm_paged_search array
>        //some verification information like
>        char *base_dn; //etc
>};
>#define LDBM_MAX_PAGED_SEARCH_CONNECTION 2
>struct ldbm_paged_search_t ldbm_paged_search[LDBM_MAX_PAGED_SEARCH_CONNECTION];
>paged_search=(void *)ldbm_paged_search;
>
>Then when in ldapsearch (i.e. servers/slapd/search.c) the controls are
>detected, these fields are initialized if there is a paged search control. The
>search routine is similar, only the candidates[] and cursor are not to be
>generated but taken from the conn->paged_search[...].cursor etc. The
>appropriate index in the ldbm_paged_search[...] has to be found which may be
>done by inserting the index number in the cookie itself or by searching the
>presented cookie in all cookies (the size of the array ldbm_paged_search won't
>be too large, so a linear search would do). Thus, when do_search()
>calls the backend specific search function, it has to pass the new values like
>the number of results desired, value of cookie etc., which can be passed by
>creating another data type:
>typedef struct paged_control_t{
>        char *cookie;
>        int range;
>        //etc...
>} paged_control;
>
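
The linear cookie lookup described above could be sketched roughly
as follows.  The struct paged_slot type and paged_slot_find() are
hypothetical names, and cookies are assumed to be NUL-terminated
strings:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAGED_SEARCH 2   /* mirrors LDBM_MAX_PAGED_SEARCH_CONNECTION */

/* Hypothetical, reduced slot: only the field needed for the lookup. */
struct paged_slot {
    const char *cookie;   /* NULL marks a free slot */
};

/* Return the index of the slot holding this cookie, or -1 if none does. */
static int paged_slot_find(const struct paged_slot *slots, size_t nslots,
                           const char *cookie)
{
    size_t i;

    assert(slots != NULL);
    if (cookie == NULL)
        return -1;   /* a first request carries no cookie */

    for (i = 0; i < nslots; i++) {
        if (slots[i].cookie != NULL && strcmp(slots[i].cookie, cookie) == 0)
            return (int)i;
    }
    return -1;
}
```

With only a couple of slots per connection the scan is cheap;
encoding the index in the cookie would avoid it but means the server
must still verify the rest of the cookie.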
>To protect against non-conformant clients, the search routines should check
>that the baseDN, scope etc. in all the successive retrievals of a paged search
>are the same as those of the first request. This check can be done in back-end
>specific search routines, as there we have the values for the current search
>as well as from the first search (during the first search the fields in
>the ldbm_paged_search are initialised with corresponding values).
>Lastly, care has to be taken to appropriately free the memory locations, which
>can be done as and when we know that the connection is closed (sort of
>registering a callback with the destruction of conn - I did not think about
>this much).


>I started implementing but had to stop midway. The changes are not enough
>to be submitted.
>Sincerely,
>Debajyoti
>
>
>
>