
Re: synchronisation of LDAP with non-ldap data

denis.havlik@t-mobile.at wrote:
Hi again! :-)

I have a bunch of data about employees that is produced by SAP (plus some more data from other sources) and exported to me on a daily basis. First, an LDAP tree is built from the SAP-exported data; then some additional data is generated in LDAP itself, so I can't simply throw everything away every night and rebuild from scratch.

How long do you need to import 2000 users into LDAP?

From time to time the following things will happen:

* old employees go away,
* new people arrive,
* some folks change name (or other data)

We've solved this with a Perl LDAP script that runs a "diff" between the export (a CSV file with user data) and LDAP. This script needs some time to execute, since it always compares all users (we're talking about 2700 users). In fact it would be better to run a diff between the previous and the current export and only check the differences against LDAP, which would be faster (especially since there usually aren't many changes). We have about 300 new users each year, about 300 are deleted, and not much else changes during the year.

Thus, I need to practically check all the LDAP user entries every night, find the differences between "now" and "before", add new users, delete (or at least somehow disable) old users, and possibly change some info for others. The question is: "what is the best way to do this":

Even though I wrote our import/update script myself, I wouldn't trust it to delete old users automatically. Better would be a simple list of deletable users that you have to acknowledge before the deletion is executed.

1) keep the "old" SAP export, do a "diff" over old and new one, and do the LDAP stuff only on the difference.

I think this would be the fastest solution, provided you can configure the diff so that it only reports differences that are relevant for LDAP.
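As a rough sketch of such a filtered diff (in Python, not our actual Perl script): it keys both exports on the user id and compares only the columns that matter for LDAP. The column names `uid`, `cn`, `mail` and `room` are assumptions for illustration; your SAP export will have its own.

```python
import csv
import io

# Assumption: hypothetical column names, adjust to the real SAP export.
KEY = "uid"
RELEVANT = ("cn", "mail")  # only these fields are mirrored into LDAP


def load(csv_text):
    """Map uid -> tuple of the LDAP-relevant fields only."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[KEY]: tuple(row[f] for f in RELEVANT) for row in reader}


def diff_exports(old_text, new_text):
    """Return (added, removed, changed) uid lists between two exports."""
    old, new = load(old_text), load(new_text)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(u for u in old.keys() & new.keys() if old[u] != new[u])
    return added, removed, changed


OLD = (
    "uid,cn,mail,room\n"
    "alice,Alice A,alice@example.org,101\n"
    "bob,Bob B,bob@example.org,102\n"
    "dave,Dave D,dave@example.org,104\n"
)
NEW = (
    "uid,cn,mail,room\n"
    "alice,Alice A,alice@example.org,205\n"   # only the room changed: ignored
    "bob,Robert B,bob@example.org,102\n"      # cn changed: reported
    "carol,Carol C,carol@example.org,103\n"   # new user
)

added, removed, changed = diff_exports(OLD, NEW)
```

Only the three resulting lists then need to be touched in LDAP; alice's room change is ignored because `room` is not in `RELEVANT`.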

2) Go with "brute force", and do a sweep over the SAP list, checking every LDAP entry, followed by another sweep to find the folks that aren't in the SAP list anymore. (OK, I could add a "last updated" field, and then disable all the entries that weren't updated.)

This is what we're doing. As I said, it needs some time to execute (at least on our small servers), but that should be no problem. We usually have a lot of updates to our LDAP database at night anyway, since we generate/update a lot of groups automatically based on the user entries and some other external databases. In detail: we build a hashtable with all uids that are present in LDAP, then we process the external CSV file, mark every uid we find in the CSV as completed, and add the ones that are not in the hashtable. The entries that remain unmarked exist only in LDAP and can be deleted.
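The hashtable bookkeeping can be sketched roughly like this (Python for illustration, not our Perl script). It assumes the uids have already been read from LDAP and from the CSV into plain lists; the function name `mark_and_sweep` is made up, and the actual LDAP add/delete calls are deliberately left out.

```python
def mark_and_sweep(ldap_uids, csv_uids):
    """Return (to_add, delete_candidates) for one nightly run."""
    # Hashtable of every uid currently in LDAP, initially unmarked.
    seen = {uid: False for uid in ldap_uids}
    to_add = []
    for uid in csv_uids:
        if uid in seen:
            seen[uid] = True      # present in both: mark as completed
        else:
            to_add.append(uid)    # only in the CSV: must be created in LDAP
    # Unmarked entries exist only in LDAP. They are returned as candidates
    # for review, not deleted automatically.
    delete_candidates = sorted(uid for uid, marked in seen.items() if not marked)
    return to_add, delete_candidates


to_add, delete_candidates = mark_and_sweep(
    ["alice", "bob", "dave"],     # uids found in LDAP
    ["alice", "bob", "carol"],    # uids found in the CSV export
)
```

Here `to_add` is `["carol"]` and `delete_candidates` is `["dave"]`. Each lookup is a single hash access, so the bookkeeping itself is cheap; the time goes into reading LDAP and comparing attributes.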

3) Do the "diff" as in 1, check that the diff isn't too big and then proceed with brute force as in 2.

If the diff is big, then it will not be different from your brute-force method (but maybe a bit faster?).
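One possible reading of the size check, as a minimal sketch in Python: count how many users the diff touches and fall back to the full sweep once that exceeds some fraction of the directory. The 10% threshold and the name `choose_strategy` are assumptions for illustration, not something we have measured.

```python
def choose_strategy(added, removed, changed, total_users, max_fraction=0.10):
    """Pick 'incremental' for small diffs, 'full_sweep' for suspiciously big ones."""
    touched = len(added) + len(removed) + len(changed)
    # An empty directory or a diff touching too many users is treated as
    # unusual, so the full comparison against LDAP is used instead.
    if total_users == 0 or touched / total_users > max_fraction:
        return "full_sweep"
    return "incremental"


small = choose_strategy(["carol"], ["dave"], ["bob"], total_users=2000)
large = choose_strategy(list(range(500)), [], [], total_users=2000)
```

With 2000 users, a diff touching 3 entries gives "incremental", while one touching 500 gives "full_sweep". An unusually large diff is also a good moment to stop and look at the export by hand before anything is deleted.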

4) Something more intelligent?

At the moment, I have 2000 users to worry about. If/when someone decides that they need such a system in the whole company, this number might grow by a factor of 10-20, and the whole rebuild should not take more than a few minutes. Alternatively, a longer update time (up to 1h) without service disruption would do.

If there are not many changes during the update period, I think it would be no problem if it needs some time. Even if your user data changes frequently (including during the update), you should have no problems; but if the update and some other process change the same entry at the same moment, you may run into trouble.

Personally, I would prefer something "elegant" to the "brute force" approach, but I'm worried about the possibility of failing to update some entries (whatever the reason) one day, and thus causing a complete disaster the next day. Since LDAP does not offer transaction support, the whole logic must be

Maybe that hashtree-idea isn't that bad?

What is considered a "best practice" here?

PS: I can't yet do a "push" action that would update the LDAP entry whenever an SAP entry changes, because that could take forever to set up.

We had the same problem with Lotus Domino (even if Domino has an LDAP interface).

best regards,
          \\\ ||| ///                               _\=/_
           (  @ @  )                                (o o)
| Markus Schabel      TGM - Die Schule der Technik   www.tgm.ac.at |
| IT-Service          A-1200 Wien, Wexstrasse 19-23  net.tgm.ac.at |
| markus.schabel@tgm.ac.at                   Tel.: +43(1)33126/316 |
| markus.schabel@members.fsf.org             Fax.: +43(1)33126/154 |
| FSF Associate Member #597, Linux User #259595 (counter.li.org)   |
|        oOOo        Yet Another Spam Trap:     oOOo               |
|       (    )    oOOo    yast@tgm.ac.at       (   )     oOOo      |
+--------\  (----(   )--------------------------\ ( -----(   )-----+
          \_)     ) /                            \_)      ) /
                 (_/                                     (_/

Computers are like airconditioners:
  They stop working properly if you open windows.