[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
back-sql: inefficient query with ldap_entry_objclasses
- To: openldap-devel@OpenLDAP.org
- Subject: back-sql: inefficient query with ldap_entry_objclasses
- From: Aike Reyer <aike@users.sourceforge.net>
- Date: Wed, 30 Nov 2005 16:43:54 +0100
- Cc: Thomas Hoth <tho@lihas.de>
- Content-disposition: inline
- In-reply-to: <63186.81.72.89.40.1129733080.squirrel@81.72.89.40>
- References: <20051018123943.GA19243@lihas.de> <61236.81.72.89.40.1129709695.squirrel@81.72.89.40> <20051019092729.GA7953@lihas.de> <63186.81.72.89.40.1129733080.squirrel@81.72.89.40>
- User-agent: Mutt/1.4.2i
Hello,
I had a closer look at Thomas' problem[1] with back-sql. The problem was with
a query like this, that took too much time:
SELECT DISTINCT
ldap_entries.id,groups.id,text('posixGroup') AS objectClass,ldap_entries.dn AS dn
FROM ldap_entries,groups,ldap_entry_objclasses,posix_account,groups_users
WHERE groups.id=ldap_entries.keyval
AND ldap_entries.oc_map_id=4
AND 9=9
AND groups_users.memberuid=posix_account.uidnumber
AND groups_users.gidnumber=groups.gidnumber
AND ( (2=2 OR ( ldap_entries.id=ldap_entry_objclasses.entry_id
AND ldap_entry_objclasses.oc_name='posixGroup'))
AND ((posix_account.uid='lml34') OR 7=7))
As Pierangelo Masarati already wrote[2] TRUE OR query == TRUE. So
(2=2 OR ( ldap_entries.id=ldap_entry_objclasses.entry_id
AND ldap_entry_objclasses.oc_name='posixGroup'))
is the same as 2=2.
If this is the case, then the table ldap_entry_objclasses is not conditionally
joined with the other tables in the where-clause. So the whole table is
cross-joined with the actual result. Multiple rows are then sorted out again
with DISTINCT.
However in this query the values of ldap_entry_objclasses are not used for
anything. So the table could be purged from this query:
SELECT DISTINCT
ldap_entries.id,groups.id,text('posixGroup') AS objectClass,ldap_entries.dn AS dn
FROM ldap_entries,groups,posix_account,groups_users
WHERE groups.id=ldap_entries.keyval
AND ldap_entries.oc_map_id=4
AND 9=9
AND groups_users.memberuid=posix_account.uidnumber
AND groups_users.gidnumber=groups.gidnumber
AND ( (2=2)
AND ((posix_account.uid='lml34') OR 7=7))
This is easily done by changing the code around line 733 in back-sql/search.c
(version 2.3.11):
static struct berval ldap_entry_objclasses = BER_BVC( "ldap_entry_objclasses" );
backsql_merge_from_tbls( bsi, &ldap_entry_objclasses );
backsql_strfcat( &bsi->bsi_flt_where, "lbl",
(ber_len_t)STRLENOF( "(2=2 OR (ldap_entries.id=ldap_entry_objclasses.entry_id AND ldap_entry_objclasses.oc_name='" /* ')) */ ),
"(2=2 OR (ldap_entries.id=ldap_entry_objclasses.entry_id AND ldap_entry_objclasses.oc_name='" /* ')) */,
&bsi->bsi_oc->bom_oc->soc_cname,
(ber_len_t)STRLENOF( /* ((' */ "'))" ),
/* ((' */ "'))" );
to:
backsql_strfcat( &bsi->bsi_flt_where, "l",
(ber_len_t)STRLENOF( "(2=2)",
"(2=2)");
I changed this and all the results I checked (including other queries using
this part of code) were as expected.
I also did some tests if backsql_merge_from_tbls( bsi, &ldap_entry_objclasses )
is necessary for any other part. But it doesn't seem so.
The above query was used to do a "su username" on a system which stores about
5000 posix and samba users and 1000 groups in a PostgreSQL-DB. Without this
change it was not possible to use the command. I aborted it after about
15 minutes. With this change it took about 20 seconds. Which is still slow but
much better.
Now, of course I wonder why this code is at it is. So could someone please
check if I am right with my assumptions and if the change would affect
anything else?
aikE
[1] http://www.openldap.org/lists/openldap-software/200510/msg00282.html
[2] http://www.openldap.org/lists/openldap-software/200510/msg00318.html