Re: Producing a web directory with LDAP



On Wed, 29 Jan 2003, Adam de Zoete wrote:
> Firstly let me say that i'm new to this list and new to LDAP. However
> i'm looking for a flat file solution (or non-RDBMS) to produce a web
> directory and am prepared to put in some learning curve in order to
> be in an appropriate format.

A number of your concerns are not specific to OpenLDAP, but rather relate
to LDAP in general or even to directory services in general.  You might
try asking the broader community at LDAP@UMich.Edu.  Several of us follow
that list as well.

> Firstly, could I be looking in the right place with LDAP? Would you
> have any other suggestions to build something like this?....
>
> The reason why I am looking at LDAP is because SQL tables can't build
> the hierarchical structure of a directory easily. I think I need an
> un-relational solution, with greater speed and flexibility.

Relational tables can represent hierarchical data, but it's not quite
natural.  Cut open a lot of directory services and you will find something
very like an RDBMS inside, thickly overlaid with code to present a
hierarchical view.
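
To make the "possible but not quite natural" point concrete, here is a
minimal sketch using Python's sqlite3 module.  The category table, its
columns, and the sample rows are all invented for illustration; the
technique is an ordinary adjacency list (each row names its parent) walked
with a recursive query, and it assumes an SQLite new enough to support
WITH RECURSIVE.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table in which each row points at its parent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE category ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES category(id),"
        " name TEXT NOT NULL)"
    )
    # Hypothetical sample hierarchy; parent_id is NULL for the root.
    conn.executemany(
        "INSERT INTO category VALUES (?, ?, ?)",
        [
            (1, None, "Top"),
            (2, 1, "Computers"),
            (3, 2, "Software"),
            (4, 3, "Directories"),
            (5, 1, "Science"),
        ],
    )
    return conn


def path_to(conn, category_id):
    """Return the root-to-node path of one category as a '/'-joined string."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestors(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.parent_id, c.name, a.depth + 1
            FROM category c JOIN ancestors a ON c.id = a.parent_id
        )
        SELECT name FROM ancestors ORDER BY depth DESC
        """,
        (category_id,),
    ).fetchall()
    return "/".join(row[0] for row in rows)
```

With the sample rows, path_to(build_demo_db(), 4) walks from "Directories"
up to "Top".  A directory gives you that path for free in the entry's
name; in the relational form you have to ask for it.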

> I'm looking to build a web directory that can sit on one main server
> with localized replicas in other countries. At the moment I am
> imagining running this off either Linux or OS X boxes.

This much could be done using something like rdist to synchronize
snapshots of a master set of pages, or of datasets from which dynamic
pages are constructed.

> I have some relational data that is based on the directory. In some
> articles online it claims that LDAP can't handle relational data, or
> that it's simply not tailored for updating data on the fly. Can you
> suggest ways of handling various relational bits of data, for one
> example, hit-counters? Or any data that needs to be updated
> regularly? (None of the data needs to be accurate to the second or is
> time sensitive)

Again, you *could* make a directory look like a pool of flat tables if
you're willing to do sufficient violence to the natural modes of directory
access.  It's probably not something you want to do.  But the real
difficulty lies precisely in things like your example, where the new value
of a field depends on the previous value of that field.  Without a
transaction mechanism, there is no way to guarantee that the value you
fetched is still the current value when you submit an update.  Buried
updates are expected and tolerated in a directory, but intolerable when
doing things such as updating a counter.  Directory services assume that
the sources of all updates are external to the directory, while a DBMS
usually offers extensive features for dealing with updates based on
previous values of the same data.  A directory typically has transactional
mechanisms inside it, but does not expose those mechanisms for us to use
for our own purposes.
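
The buried-update problem can be sketched in a few lines of Python.
Nothing here talks to a real directory: the dict stands in for a directory
entry, and the page table and function names are made up for the example.
The first function replays what happens when several clients each fetch
the counter and only then write back "fetched value + 1"; the second lets
the DBMS compute the new value from the old one inside a transaction.

```python
import sqlite3


def directory_style_increments(entry, clients):
    """Simulate clients that all read the counter before any of them writes.

    Each client can only replace the value with one it computed itself, so
    every write after the first buries the one before it.
    """
    fetched = [entry["hits"] for _ in range(clients)]  # everyone reads first
    for value in fetched:
        entry["hits"] = value + 1  # each write is based on a stale read
    return entry["hits"]


def make_counter_db(initial):
    """Create an in-memory table holding one hit counter (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, hits INTEGER NOT NULL)")
    conn.execute("INSERT INTO page VALUES (1, ?)", (initial,))
    conn.commit()
    return conn


def dbms_style_increments(conn, clients):
    """Let the database derive the new value from the current one."""
    for _ in range(clients):
        with conn:  # commits on success, rolls back on an exception
            conn.execute("UPDATE page SET hits = hits + 1 WHERE id = 1")
    return conn.execute("SELECT hits FROM page WHERE id = 1").fetchone()[0]
```

Starting from 10 with three clients, the directory-style version ends at
11 because two of the three increments are buried, while the DBMS-style
version ends at 13.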

> The content should be more than a few million entries and I need fast
> and flexible multiple field searching.

Please specify what you mean by "multiple field searching".  If you mean
that you want to define a complex relationship among fields which
specifies a match set, then a relational language would probably be a much
better fit to your needs.
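
For comparison, this is roughly what "multiple field searching" looks like
on the LDAP side: a search filter combines per-attribute tests with AND
and OR.  The sketch below only builds the filter string (in the standard
parenthesized prefix notation) and does not contact a server; the helper
names are mine, and the attributes l and businessCategory are just
plausible choices for a web directory.

```python
def escape_value(value):
    """Escape the characters that are special inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def equality(attr, value):
    """Build a single (attr=value) test."""
    return f"({attr}={escape_value(value)})"


def all_of(*filters):
    """AND together any number of already-built filters."""
    return "(&" + "".join(filters) + ")"


def any_of(*filters):
    """OR together any number of already-built filters."""
    return "(|" + "".join(filters) + ")"


# Entries located in London whose category is either books or music.
example = all_of(
    equality("l", "London"),
    any_of(
        equality("businessCategory", "books"),
        equality("businessCategory", "music"),
    ),
)
```

Every test in such a filter compares one attribute of one entry against a
value supplied by the client.  If a match depends on comparing fields with
each other, or on relating one record to another, you are describing a
join, and that is where a relational language fits better.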

> Can anyone suggest whether LDAP is firstly capable of the above and
> secondly the correct format for holding/searching such
> information in such quantity?

I would not choose a directory to hold the entire set of data for the sort
of service I think you envision.  But since it's all hidden behind HTTP
servers anyway, that is not necessary.  You can generate pages which
represent a view across several data repositories chosen to fit various
subsets of the data.  If the hierarchical portion of the data changes
slowly, you could keep that portion in a directory and use undisplayed
attributes of directory objects to rapidly locate corresponding records in
a DBMS.  Whether this is a good idea depends on the effort required to
maintain consistency among the repositories, i.e. the proportion of
changes which affect more than one repository.
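
A rough sketch of that split, again in Python: a dict keyed by DN stands
in for the directory, SQLite stands in for the DBMS, and the page is built
from both.  The DN, the recordKey attribute, and the listing_stats table
are all hypothetical; recordKey plays the part of the undisplayed
attribute that locates the matching DBMS record.

```python
import sqlite3

# Stand-in for the slowly changing, hierarchical part held in a directory.
DIRECTORY = {
    "cn=Acme Books,ou=Bookshops,ou=Retail,o=webdir": {
        "cn": "Acme Books",
        "labeledURI": "http://books.example/",
        "recordKey": 17,  # never displayed; used only to find the DBMS row
    },
}


def make_dbms():
    """Create the fast-changing part (here, hit counts) in a relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listing_stats ("
        " record_key INTEGER PRIMARY KEY,"
        " hits INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO listing_stats VALUES (17, 4211)")
    conn.commit()
    return conn


def page_view(dn, conn):
    """Merge one directory entry with its DBMS record into the data for a page."""
    entry = DIRECTORY[dn]
    row = conn.execute(
        "SELECT hits FROM listing_stats WHERE record_key = ?",
        (entry["recordKey"],),
    ).fetchone()
    return {
        "name": entry["cn"],
        "url": entry["labeledURI"],
        "hits": row[0] if row else 0,  # tolerate a missing DBMS record
    }
```

The "row if present, else 0" fallback is the consistency cost in
miniature: every place the two repositories can disagree needs a decision
like that one.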

One question you should try to answer is:  do you have a greater need for
accurate answers or for rapid response?  The underlying assumption for a
distributed directory service is that occasional inconsistency is not a
big problem, and that answers will converge toward correctness fairly
rapidly.  This is okay for looking up telephone numbers but would be
disastrous for bank balances.

> Can it be easily integrated into other networks/systems?
> What is your suggestion for the ideal middle-ware for use with LDAP?
> Should I use OpenLDAP Server or use the built in LDAPv3 in OS 10.2?
> Know of any examples of LDAP on the web I could look at?
> Is the learning curve steep?

The steepest portion of the learning curve seems to be associated with
authentication and privacy.  Of course, if your data are public knowledge,
then there's no question of authentication or privacy.

-- 
Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
MS Windows *is* user-friendly, but only for certain values of "user".