Re: Producing a web directory with LDAP
At 9:51 -0500 30/1/03, Mark H. Wood wrote:
On Wed, 29 Jan 2003, Adam de Zoete wrote:
Firstly, let me say that I'm new to this list and new to LDAP. However,
I'm looking for a flat file solution (or non-RDBMS) to produce a web
directory and am prepared to put in some learning curve in order to
be in an appropriate format.
A number of your concerns are not specific to OpenLDAP, but rather relate
to LDAP in general or even to directory services in general. You might
try asking the broader community at LDAP@UMich.Edu. Several of us follow
that list as well.
Thanks, Mark, I'll try that.
> Firstly, could I be looking in the right place with LDAP? Would you
have any other suggestions to build something like this?....
The reason why I am looking at LDAP is because SQL tables can't build
the hierarchical structure of a directory easily. I think I need an
un-relational solution, with greater speed and flexibility.
Relational tables can represent hierarchical data, but it's not quite
natural. Cut open a lot of directory services and you will find something
very like an RDBMS inside, thickly overlaid with code to present a
Say my directory is 6 categories deep. In order to find the path to
the root category with an RDBMS I'd have to perform 5 or 6 searches
just to plot a bread crumb trail. That's ideally what I'm trying to
> I'm looking to build a web directory that can sit on one main server
with localized replicas in other countries. At the moment I am
imagining running this off either Linux or OS X boxes.
This much could be done using something like rdist to synchronize
snapshots of a master set of pages, or of datasets from which dynamic
pages are constructed.
That sounds ideal for localized deployment.
> I have some relational data that is based on the directory. In some
articles online it claims that LDAP can't handle relational data, or
that it's simply not tailored for updating data on the fly. Can you
suggest ways of handling various relational bits of data, for one
example, hit-counters? Or any data that needs to be updated
regularly? (None of the data needs to be accurate to the second or is
Again, you *could* make a directory look like a pool of flat tables if
you're willing to do sufficient violence to the natural modes of directory
access. It's probably not something you want to do.
Correct, I want to follow those 'natural modes of directory access'
you're mentioning. Is LDAP the only protocol to offer this?
But the real
difficulty lies precisely in things like your example, where the new value
of a field depends on the previous value of that field. Without a
transaction mechanism, there is no way to guarantee that the value you
fetched is still the current value when you submit an update.
But I'm not specifically worried about that because 90% of the
updated data is not time sensitive. I just have to perform updates to
values frequently, but these values are not time sensitive.
updates are expected and tolerated in a directory, but intolerable when
doing things such as updating a counter. Directory services assume that
the sources of all updates are external to the directory, while a DBMS
usually offers extensive features for dealing with updates based on
previous values of the same data. A directory typically has transactional
mechanisms inside it, but does not expose those mechanisms for us to use
for our own purposes.
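
The counter problem described above can be shown with a small simulation. This is only a sketch: the dictionary is an in-memory stand-in for a directory entry, no LDAP server is involved, and the names `entry`, `read_hits` and `write_hits` are invented for the example.

```python
# In-memory stand-in for a directory entry; no real LDAP server involved.
entry = {"hits": 10}


def read_hits():
    return entry["hits"]


def write_hits(value):
    entry["hits"] = value


# Two clients each fetch the counter before either writes it back.
seen_by_a = read_hits()
seen_by_b = read_hits()

write_hits(seen_by_a + 1)  # client A stores 11
write_hits(seen_by_b + 1)  # client B also stores 11, overwriting A

print(entry["hits"])  # 11, not 12: one hit has been lost
```

For a hit counter that need not be exact, an occasional lost increment like this may be acceptable; for anything that must be exact, it is not.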
The content should be more than a few million entries and I need fast
and flexible multiple field searching.
Please specify what you mean by "multiple field searching". If you mean
that you want to define a complex relationship among fields which
specifies a match set, then a relational language would probably be a much
better fit to your needs.
No, I would like a single search to look into multiple categories. I
would like a search to be performed on Title, Description, Keyword at
once. I imagine it's easy through partitioning the searchable data.
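
In LDAP terms, a search across several fields at once is usually written as an OR filter with one substring clause per attribute. Here is a minimal Python sketch that only builds the filter string. The attribute names `title`, `description` and `keyword` are assumptions standing in for whatever the schema actually defines, and the helper names are invented for the example.

```python
def escape_filter_value(text):
    """Escape the characters that RFC 4515 reserves in filter values."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in text)


def any_field_filter(term, attributes=("title", "description", "keyword")):
    """Build an OR filter matching `term` as a substring of any attribute."""
    value = escape_filter_value(term)
    clauses = "".join(f"({attr}=*{value}*)" for attr in attributes)
    return f"(|{clauses})"


print(any_field_filter("jazz"))
# (|(title=*jazz*)(description=*jazz*)(keyword=*jazz*))
```

Whether a server answers such a filter quickly over a few million entries depends on how the attributes are indexed for substring matching, so that would need testing.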
> Can anyone suggest whether LDAP is firstly capable of the above and
secondly the correct format for holding/searching such
information in such quantity?
I would not choose a directory to hold the entire set of data for the sort
of service I think you envision. But since it's all hidden behind HTTP
servers anyway, that is not necessary. You can generate pages which
represent a view across several data repositories chosen to fit various
subsets of the data. If the hierarchical portion of the data changes
slowly, you could keep that portion in a directory and use undisplayed
attributes of directory objects to rapidly locate corresponding records in
a DBMS. Whether this is a good idea depends on the effort required to
maintain consistency among the repositories, i.e. the proportion of
changes which affect more than one repository.
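
A rough sketch of that split, in Python: the dictionary stands in for directory entries, and an in-memory SQLite database stands in for the DBMS. The attribute name `recordId`, the `stats` table and the function `page_data` are all hypothetical, chosen only to show how an undisplayed attribute could act as the key into the relational side.

```python
import sqlite3

# Stand-in for directory entries: DN -> attributes. 'recordId' is a
# hypothetical, undisplayed attribute holding the DBMS key.
directory = {
    "ou=Jazz,ou=Music,dc=example,dc=com": {"ou": "Jazz", "recordId": 42},
}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stats (record_id INTEGER PRIMARY KEY, hits INTEGER)")
db.execute("INSERT INTO stats VALUES (42, 1000)")


def page_data(dn):
    """Combine slow-changing directory data with volatile DBMS data."""
    entry = directory[dn]
    row = db.execute(
        "SELECT hits FROM stats WHERE record_id = ?", (entry["recordId"],)
    ).fetchone()
    return {"category": entry["ou"], "hits": row[0]}


# The counter update happens inside the DBMS, as a single statement.
db.execute("UPDATE stats SET hits = hits + 1 WHERE record_id = ?", (42,))
print(page_data("ou=Jazz,ou=Music,dc=example,dc=com"))
# {'category': 'Jazz', 'hits': 1001}
```

Here the frequently updated value never touches the directory, so the only cross-repository consistency to maintain is the key itself.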
One question you should try to answer is: do you have a greater need for
accurate answers or for rapid response?
90% needs rapid response; the other 10% needs to be accurate and is more time sensitive.
The underlying assumption for a
distributed directory service is that occasional inconsistency is not a
big problem, and that answers will converge toward correctness fairly
rapidly. This is okay for looking up telephone numbers but would be
disastrous for bank balances.
This inconsistency margin falls into the same ratio as above. But none
of this data is remotely as sensitive as bank balances.
> Can it be easily integrated into other networks/systems?
What is your suggestion for the ideal middle-ware for use with LDAP?
Should I use OpenLDAP Server or use the built-in LDAPv3 in Mac OS X 10.2?
Know of any examples of LDAP on the web I could look at?
Is the learning curve steep?
The steepest portion of the learning curve seems to be associated with
authentication and privacy. Of course, if your data are public knowledge,
then there's no question of authentication or privacy.
I think my real problem is the fact that I am dealing with relational
and un-relational data. 90% is un-relational and would benefit from
LDAP, 10% is relational (almost like a shopping cart, but the
contents aren't that important). The data is intermingled, i.e. I
can't display un-relational data (from LDAP) without having to relate
to the relational stuff.
However, I think I would be shooting myself in the foot building a
directory in an RDBMS. I am hoping that as I get a clearer picture of
this development, I should be able to envisage a way of using LDAP
and SQL to form something that could work between the two.
So I have the speed and flexibility of LDAP with SQL offering the
more human relational level.
// Adam de Zoete