Re: question



Lord.

Okay, start by ejecting from your mind the idea that LDAP is in some
way equivalent to an RDBMS. It isn't.

RDBMSes work off of sets of defined tuples, in SQL parlance "tables".
These tuples are designed to be qualified, joined, excluded, sorted,
manipulated, etc. The data model is two-dimensional -- lists of tuples.

LDAP is a protocol. That's it. It's a way of accessing data. The
representation of the data as returned by the protocol is not lists of
tuples but rather a tree hierarchy. The attributes of an entry are not
rigidly defined the way SQL columns are -- which attributes a record
may or must carry is determined on a per-record basis by its
objectClass values. LDAP searches and returns data in a tree fashion.
The best backends represent the data in tree-like relations. However,
as LDAP is merely a protocol, it is up to the backend to decide how to
store that data. You can even use an RDBMS to store the data; however,
given the data model described above, RDBMSes do not efficiently store
tree-shaped data -- their data model is lists of tuples with a fixed
set of columns.
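
To make the data-model difference concrete, here is a minimal sketch in
Python. It is not how OpenLDAP stores anything internally; the DNs, the
attribute values and the helper names (parent, children, subtree) are
all made up for illustration. The point is only that entries hang off a
tree and that two entries under the same parent can carry different
attribute sets depending on their objectClass.

```python
# Hypothetical directory: each entry is keyed by its DN, and the set of
# attributes an entry carries depends on its objectClass values.
directory = {
    "dc=example,dc=com": {"objectClass": ["domain"], "dc": ["example"]},
    "ou=people,dc=example,dc=com": {
        "objectClass": ["organizationalUnit"], "ou": ["people"]},
    "uid=foo,ou=people,dc=example,dc=com": {
        "objectClass": ["inetOrgPerson"],
        "uid": ["foo"], "cn": ["Foo Bar"], "sn": ["Bar"],
        "mail": ["foo@example.com"]},
    "uid=bar,ou=people,dc=example,dc=com": {
        "objectClass": ["account"], "uid": ["bar"]},
}


def parent(dn):
    """Return the parent DN (naive split; ignores escaped commas)."""
    return dn.split(",", 1)[1] if "," in dn else ""


def children(base):
    """One-level scope: entries directly below `base`."""
    return sorted(dn for dn in directory if parent(dn) == base)


def subtree(base):
    """Subtree scope: `base` itself plus everything below it."""
    return sorted(dn for dn in directory
                  if dn == base or dn.endswith("," + base))
```

Calling children("ou=people,dc=example,dc=com") walks one level of the
tree, while subtree("dc=example,dc=com") returns every entry. A table
would need the same columns on every row; here uid=foo and uid=bar do
not share an attribute set.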

In my experience, with a well-tuned Sleepycat database behind the BDB
backend in OpenLDAP, with indices defined intelligently and a
multi-level tree structure, OpenLDAP runs like a meth head at a
discotheque.

You are correct in one sense, though. If I do a search on a tree with a
scope of one-level and a simple filter which looks for a single value on
an indexed field (e.g. search for uid=foo with scope of one where you
have an equality index on uid) or I do a SQL query on a single table for
a single instance of an indexed column (e.g. "select * from some_table
where uid = 'foo'" where uid is indexed), you're not going to see much
difference. There's only so fast that either can run, and both the SQL
engine and OpenLDAP's BDB backend generally choose the most efficient
plan for that search.
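
The equivalence of the two simple lookups can be sketched as follows.
This is an illustration under loud assumptions: SQLite (from the Python
standard library) stands in for MySQL only so that the example runs
anywhere, and the directory side is a toy dictionary, not the BDB
backend. The table name and the uid value come from the example above;
idx_uid, entries and uid_eq_index are hypothetical names.

```python
import sqlite3

# SQL side: SQLite stands in for MySQL purely so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (uid TEXT, cn TEXT)")
conn.execute("CREATE INDEX idx_uid ON some_table (uid)")
conn.executemany("INSERT INTO some_table VALUES (?, ?)",
                 [("foo", "Foo Bar"), ("bar", "Bar Baz")])
sql_rows = conn.execute(
    "SELECT * FROM some_table WHERE uid = ?", ("foo",)).fetchall()

# Directory side: a toy equality index mapping a uid value to entry DNs.
entries = {
    "uid=foo,ou=people,dc=example,dc=com": {"uid": ["foo"], "cn": ["Foo Bar"]},
    "uid=bar,ou=people,dc=example,dc=com": {"uid": ["bar"], "cn": ["Bar Baz"]},
}
uid_eq_index = {
    "foo": ["uid=foo,ou=people,dc=example,dc=com"],
    "bar": ["uid=bar,ou=people,dc=example,dc=com"],
}
# Equivalent of the filter (uid=foo): one index probe, then fetch entries.
ldap_hits = [entries[dn] for dn in uid_eq_index.get("foo", [])]
```

In both halves the work is a single index probe followed by a fetch of
the matching record, which is why neither side has much room to beat
the other on this kind of query.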

But most searches are not like that. You have joins in SQL. You have
more complex filters in OpenLDAP. And the way that the BDB backend
searches its indices for a complex filter is very different from the
way that MySQL searches its indices for a unioned, left-outer-joined,
grouped query with "having" and "where" constraints, sorted according
to two fields -- you get the idea. Multiple indices serving different
conditions, combined with limited flexibility in data selection, mean
that the bdb backend is disgustingly optimized for queries, and less so
but still exceptionally fast for updates/deletes.
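
One way to picture how a complex filter can be answered from several
indices: resolve each indexed term to a set of candidate entry IDs,
then combine the sets, with AND as intersection and OR as union. The
sketch below is a toy model of that idea, not the actual back-bdb code;
the index contents, the entry IDs and the helper names (eq, and_, or_)
are assumptions made for the example.

```python
# Toy equality indices: attribute -> value -> set of entry IDs.
indices = {
    "objectClass": {"inetOrgPerson": {1, 2, 3}, "account": {4}},
    "ou": {"ops": {1, 4}, "sales": {2, 3}},
    "l": {"Boston": {1, 2}, "Denver": {3, 4}},
}


def eq(attr, value):
    """Candidate IDs for an equality term such as (ou=ops)."""
    return set(indices.get(attr, {}).get(value, set()))


def and_(*candidates):
    """AND of filter terms: intersect the candidate sets."""
    result = set(candidates[0])
    for ids in candidates[1:]:
        result &= ids
    return result


def or_(*candidates):
    """OR of filter terms: union the candidate sets."""
    result = set()
    for ids in candidates:
        result |= ids
    return result


# Filter: (&(objectClass=inetOrgPerson)(|(ou=ops)(l=Denver)))
matches = and_(eq("objectClass", "inetOrgPerson"),
               or_(eq("ou", "ops"), eq("l", "Denver")))
```

Here matches ends up as {1, 3}. No join planning, grouping or sorting
is involved; the filter grammar only allows set operations over index
lookups, which is the "limited flexibility" that keeps reads cheap.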

Don't compare apples and oranges. If you are starting totally from
scratch (as in you don't have a currently implemented system), ask
yourself:

- what data model best represents my data? Tree hierarchy or tabular?
- what applications would I possibly use and what systems do they
support? MySQL or LDAP?
- how often will I be querying versus writing? OpenLDAP using the bdb
backend is at its best in situations where you read a thousand times
for every write.
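
The read/write asymmetry behind the last question can be sketched with
a toy directory that counts how many index structures each operation
touches. Everything here (the ToyDirectory class, its methods and the
index_touches counter) is hypothetical and only illustrates the
principle that a write must maintain every index while an equality
search can be served from one.

```python
class ToyDirectory:
    """Toy store with one equality index per indexed attribute."""

    def __init__(self, indexed_attrs):
        self.entries = {}
        self.indices = {attr: {} for attr in indexed_attrs}
        self.index_touches = 0  # counts index updates and index probes

    def add(self, dn, attrs):
        """A write has to update every index that covers the entry."""
        self.entries[dn] = attrs
        for attr, index in self.indices.items():
            for value in attrs.get(attr, []):
                index.setdefault(value, set()).add(dn)
                self.index_touches += 1

    def search_eq(self, attr, value):
        """An equality read probes exactly one index."""
        self.index_touches += 1
        return sorted(self.indices[attr].get(value, set()))


d = ToyDirectory(["uid", "cn", "mail"])
d.add("uid=foo,dc=example,dc=com",
      {"uid": ["foo"], "cn": ["Foo Bar"], "mail": ["foo@example.com"]})
write_cost = d.index_touches
found = d.search_eq("uid", "foo")
read_cost = d.index_touches - write_cost
```

With three indexed attributes the single add costs three index updates
and the search costs one probe. Every extra index makes reads more
flexible and writes more expensive, so the payoff depends on how
read-heavy the workload is.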

And if you're starting from scratch, try them out! Tune them both,
hammer them both, and see which one holds up best under stress for your
particular situation.
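
A minimal harness for that kind of comparison might look like the
sketch below. The function names (hammer, compare) and the two
placeholder operations are assumptions; in a real test each placeholder
would be replaced with a callable that runs one representative search
or query against the tuned server.

```python
import time


def hammer(operation, iterations=1000):
    """Run `operation` repeatedly and return operations per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        operation()
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")


def compare(candidates, iterations=1000):
    """Measure every named operation under the same iteration count."""
    return {name: hammer(op, iterations) for name, op in candidates.items()}


# Placeholders only: swap in one real LDAP search and one real SQL query.
results = compare({
    "ldap_search": lambda: None,
    "sql_select": lambda: None,
}, iterations=100)
```

A single-threaded loop like this only gives a first impression. Mixing
reads and writes in the proportion you expect in production, and
running several clients at once, says far more about how each system
holds up under stress.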

But don't assume there's much in common between LDAP and SQL.

-jag

On Thu, 2004-07-15 at 08:46, Carlo Truijllo wrote:
> Uh ? 
> only indexes?
> Also in your RDBMS you can put tons of index using same feature like BTREEs 
> and so on... but it isn't the same thing... why? ;c)
> A totally newbie asked me "can you explain me easily why it is so fast?" eheh 
> ... hard question ! ;)
> Carl
> 
> At 16:29 on Thursday, 15 July 2004, Ottavio Campana wrote:
> > Carlo Truijllo wrote:
> > > Hi folks
> > > 	I wonder how openldap is faster in read than in write, in other word ...
> > > I know that there is a fast protocol below ( faster than old X.500) but I
> > > wonder what kind of process or algorithm is based?
> > > it is only due to  the speed of backend ?
> > > and... with a backend like SQL how can it be faster than a simple RDBMS ?
> > > How it is organized in poor words ?
> >
> > Just think about ldap's indexes . If you use several index, when you add
> > an entry you need to update all the indexes, but when you read something
> > you can just select the best index and extract the data from it.
> >
> > This is why reading is faster than writing.
-- 
Joshua Ginsberg <joshg@brainstorminternet.net>
Brainstorm Internet Network Operations