[Date Prev][Date Next]
SQL backend architecture & some comments
Thanks for feedback! I finally have some time to answer most of the
letters, including private. I'll try to do it in _one_ letter describing
current architecture and possible directions of further development.
Hopefully it will not be too long... :)
As for sources - I'll try to port them into OpenLDAP, as Kurt D.
>Though making code available is always nice... it will be more
>usable to the general public if updated to OpenLDAP... and CVS
>really is quite easy to learn.
I do not have much time for it, so it will take a couple of days (for
now, I only got openldap-devel sources by CVS :)
As I stated before, my goal was not in making backend that stores
information in RDBMS. It means I could not develop specialized
relational schema for this purpose, like Bastiaan Bakker described in
>Simply put, a single big table contains rows with DN, attribute name
>attribute value. A prototype has been built that uses a generic CORBA
>backend interface (implemented with OmniORB2) that delegates requests
>a Java server...
The goal was to _present_ information already stored in RDBMS by some
information system, not changing its relational schema, so that this
system could function in its usual way.
Besides, the architecture Bastiaan described seems too complex to me --
too many tiers on the way to actual data... Till last few days, I
thought ODBC would be a good solution for portability, adding just one
more tier, but now that I still cannot find free driver for
Solaris/Oracle I started to doubt it... Later in this letter I'll try to
formulate my recent ideas about the problem of DBMS access.
Back to architecture. A few tables must be added to existing relational
schema. The most important two of them are LDAP_OBJCLASSES and
LDAP_ATTRS. Records in first table contains name of LDAP objectclass
(NAME), name of the table containing primary key for this objectclass
(KEYTBL), and column name with this primary key (KEYCOL). It _doesn't_
mean that ALL the data corresponding to entries of this objectclass must
reside in the table designated in KEYTBL. The only constraint such thing
gives is that primary key for it cannot be complex (it must consist of
one column). Second table contains reference to LDAP_OBJCLASSES record
(OC_ID), name of LDAP attribute (NAME), and :
- expession giving the attribute value (text you would write in the
SELECT ... clause) (SEL)
- optional definition, alias and join conditions for the table(s)
containing data for this attribute (TBL_DEF,TBL_NAME,JOIN_ON).
Records for all attributes that are required/allowed in the objectclass
must present. They are used to construct query, possibly joining several
tables, to get data for an entry. I feel my English is too bad to
clearly describe how it is done, so if anyone interested I could post
some examples later :).
Finally, there is one more table, LDAP_ENTRIES, that contains DN of the
entry (DN), reference to objectclass record (OBJCLASS), reference to
parent entry (PARENT), and integer value of primary key (KEYVAL) (this
adds another constraint -- the primary key must be an integer; this is
generally not a problem, and we could add more columns for most common
During startup the backend reads in information from LDAP_OBJCLASSES and
It then maintains one ODBC connection for each LDAP connection, upon
incoming search request in makes lookup in LDAP_ENTRIES table getting
primary keys and objectclasses for candidates, then retrieves data using
To prevent some possible questions:
Q.: why trouble with this query generation when we simply could
predefine query (for each objectclass) that retrieves information for an
entry given its primary key?
A: because, having information about each attribute, we could try to use
it to construct single query that gets all entries matching LDAP search
criteria (by processing LDAP filter into SQL expressions). It is partly
an answer to the next 2 questions.
Q: The way search is curently implemented is very ineffective compared
to what could be done by SQL-engine. Why those things from previous
answer are not done? What are alternative ways to improve effectiveness?
A: the way it works now is just the first step, because further
optimization requires that we make several principal decisions, and
other reasons are of technical matter. For instance, if we had to deal
with complex RDBMSes only, such as Oracle or Informix, we could greatly
simplify construction of queries (whether one for each entry PK, or with
translated LDAP-filter), by requiring that users define views for each
objectclass. Having data for each objectclass in one single table
(view), the work could be easier...
In fact, we could also provide stored procedures that not only select,
but also insert, delete and modify data for each objectclass...
But when I worked on the backend, I had only mySQL -- no views, no
stored procedures. Now I'm thinking to redesign LDAP_OBJCLASS to store
query analogous to view definition, and LDAP_ATTRS to store only name of
the column from this query. This provides ability to use views where
possible, and its analog in cases like mySQL.
Another principal decision is about lookups in LDAP_ENTRIES. For now
things like subtree searches are done using "select .... where dn like
'%base'", and I don't know how to index _TAILS_ of string fields... So I
don't know whether it would be more efficient to try to make _single_
query for each objectclass that performs join of real data tables with
LDAP_ENTRIES. Maybe we should simply sort candidates by objectclass and
then execute prepared query for each, which differs from currently
implemented variant only in that there's one query construction per
objclass? Or maybe we should think about more complex support for tree
structure than LDAP_ENTRIES?
Now some of the letters I received. Some them were private, but I
thought it would be useful if I posted some answers publicly.
David P. Meyers:
>Are you aware of commercial efforts to put LDAP wire protocol in front
>of relational databases? Both IBM and Oracle have products that
>provide the functionality that you describe. You can find out more
>at www.oracle.com and www.ibm.com . Just use the search engine
>when you get to those sites.
Thank you very much. I managed to find some information about LDAP
support in Oracle8i, but as I could understand, it is intended for work
with Oracle8 only... Besides, it is not given out for free... But
nevertheless, I personally have access to Oracle products, so I think
I'll manage to get idea or two from there... :)
>I am primarily interested in another problem. How to achieve high
>performance writes to an information service using the LDAP protocol.
>As you know, most LDAP implementations were designed for read
>and limited write performance.
Yes, and situation with the project I participate is driving me to
investigate possibility of such writes. The only way I see for it by now
is to provide stored procedures for each objectclass. But it is not much
better than perl-backend, since it requires knowledge of some
programming language, whether it is PL/SQL or Perl.
One more thing about perl-backend.
Kurt D. Zeilenga:
>We currently provide support for
>RDBMs through "programmable" backends such as back-tcl and back-perl.
Earlier I thought it would be too ineffective to use such things, but
now, having such troubles with ODBC, I'm starting to think about
implementing same ideas as perl module for back-perl. After all,
Perl::DBI is working anywhere, and all those hashes and regexps make
life much easier...
Also numerous people wrote me saying the idea was rather useful, asking
for sources or offering to test them on different platforms. Thanks very
much, but I think I'll try to follow Kurt's advice and port them to
OpenLDAP first. If I feel it takes too long, or if anybody requests it,
I'll post some public ftp URLs with current version... Excuse me for not
answering you personally, but I simply have no ability to do this...
Uuugh... it is probably the largest text in English I've ever written...
:) Hope it was of any good for someone...