Issue 5472 - ldap_get_values() should handle paged results from LDAP/AD
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
Assignee: OpenLDAP project
Reported: 2008-04-16 14:58 UTC by pere@hungry.com
Modified: 2011-01-26 20:07 UTC

Description pere@hungry.com 2008-04-16 14:58:10 UTC
Full_Name: Petter Reinholdtsen
Version: 2.1.30
OS: Debian GNU/Linux Etch
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (2001:700:100:6:213:72ff:fe93:c639)


I ran into this problem when trying to use nss-ldapd with LDAP
from a Microsoft Active Directory server.  The problem only appears if there
are more than 1500 members in a group.  When there are fewer than 1500 members,
the result from the LDAP server looks like this:

  member: CN=user1,OU=Elever,OU=ULS,OU=VG,OU=Skoler,DC=SKOLEN,DC=LOCAL
  member: CN=user2,OU=Ansatte,OU=ULS,OU=VG,OU=Skoler,DC=SKOLEN,DC=LOCAL

This is properly handled by ldap_get_values(), and the nss-ldapd module works
properly.  For groups with more than 1500 members, the result from the LDAP
server looks like this:

  member;range=0-1499: CN=user1,OU=Elever,OU=OVO,OU=VO,OU=Skoler,DC=SKOLEN,DC=LOCAL
  member;range=0-1499: CN=user2,OU=Ansatte,OU=OVO,OU=VO,OU=Skoler,DC=SKOLEN,DC=LOCAL

This notation is not handled by ldap_get_values(), which returns NULL, resulting
in a group with zero members.  Is there a way to parse such "paged" attributes
using the openldap library, and could ldap_get_values() be changed to handle
these?

Is the range= notation legal LDAP notation?  I have been unable to find
information about this in any RFC, but our resident LDAP expert mentioned that
it could be according to some extension specification.  I have not been able to
find information about it.

To get the rest of the members I have to ask for attribute
'member;range=1500-*' and repeat this until the result shows, for example,
'range=6000-*' to indicate that this is the last batch of members.
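
The retrieval loop described above can be sketched roughly as follows. This is only an illustration in Python, not OpenLDAP code: `fetch_range` is a hypothetical callback standing in for a search that requests `<attr>;range=<start>-*`, and `make_fake_server` is a made-up simulation of a server with a 1500-value limit.

```python
import re

# Matches the AD-style range option, e.g. "member;range=0-1499" or
# "member;range=6000-*".
RANGE_RE = re.compile(
    r"(?P<type>[^;]+);range=(?P<low>\d+)-(?P<high>\d+|\*)", re.IGNORECASE
)


def parse_range(description):
    """Split 'member;range=0-1499' into ('member', 0, 1499).

    The upper bound is None when the server answered with '*'.
    Returns None if the description carries no range option.
    """
    m = RANGE_RE.fullmatch(description)
    if m is None:
        return None
    high = None if m.group("high") == "*" else int(m.group("high"))
    return m.group("type"), int(m.group("low")), high


def fetch_all(fetch_range, attr):
    """Collect every value of attr by requesting successive ranges.

    fetch_range(attr, start) is a hypothetical callback. It returns the
    attribute description used in the reply and the list of values.
    """
    values, start = [], 0
    while True:
        description, batch = fetch_range(attr, start)
        values.extend(batch)
        parsed = parse_range(description)
        # No range option, or an upper bound of '*', marks the last batch.
        if parsed is None or parsed[2] is None:
            return values
        start = parsed[2] + 1


def make_fake_server(members, limit=1500):
    """Simulate a server that returns at most `limit` values per request."""
    def fetch_range(attr, start):
        batch = members[start:start + limit]
        if start + limit >= len(members):
            return f"{attr};range={start}-*", batch
        return f"{attr};range={start}-{start + limit - 1}", batch
    return fetch_range
```

Calling `fetch_all(make_fake_server(members), "member")` walks the ranges until the reply carries `range=<n>-*`, mirroring the manual procedure.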

Comment 1 Kurt Zeilenga 2008-04-16 16:06:24 UTC
On Apr 16, 2008, at 7:58 AM, pere@hungry.com wrote:
> I ran into this problem when trying to use nss-ldapd with LDAP
> from a Microsoft Active Directory server.  The problem only appears
> if there are more than 1500 members in a group.  When there are fewer
> than 1500 members, the result from the LDAP server looks like this:
>
>  member: CN=user1,OU=Elever,OU=ULS,OU=VG,OU=Skoler,DC=SKOLEN,DC=LOCAL
>  member: CN=user2,OU=Ansatte,OU=ULS,OU=VG,OU=Skoler,DC=SKOLEN,DC=LOCAL
>
> This is properly handled by ldap_get_values(), and the nss-ldapd
> module works properly.  For groups with more than 1500 members, the
> result from the LDAP server looks like this:
>
>  member;range=0-1499: CN=user1,OU=Elever,OU=OVO,OU=VO,OU=Skoler,DC=SKOLEN,DC=LOCAL
>  member;range=0-1499: CN=user2,OU=Ansatte,OU=OVO,OU=VO,OU=Skoler,DC=SKOLEN,DC=LOCAL
>
> This notation is not handled by ldap_get_values(), which returns
> NULL, resulting in a group with zero members.

This is proper and well-intended behavior.  You asked for the values
returned under the attribute description "member", not those under the
(invalid) attribute description "member;range=0-1499".  Two attribute
descriptions which share the same attribute type do not necessarily
refer to the same attribute.

> Is there a way to parse such "paged" attributes using the openldap
> library, and could ldap_get_values() be changed to handle these?
>
> Is the range= notation legal LDAP notation?

No.  Attribute description options cannot contain equal signs.  See  
RFC 4512.
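
To make the grammar point concrete: RFC 4512 defines an attribute description option as one or more `keychar` characters, where a `keychar` is a letter, a digit, or a hyphen. A minimal Python sketch of that check follows; the function name is illustrative, and the attribute type part is deliberately not validated.

```python
import re

# RFC 4512: option = 1*keychar, keychar = ALPHA / DIGIT / HYPHEN
OPTION_RE = re.compile(r"[A-Za-z0-9-]+")


def invalid_options(description):
    """Return the options of an attribute description that violate RFC 4512.

    The part before the first ';' is the attribute type and is skipped.
    """
    _attr_type, *options = description.split(";")
    return [opt for opt in options if not OPTION_RE.fullmatch(opt)]
```

Under this check `cn;lang-en` passes, while the `range=0-1499` option is rejected because of the equal sign.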

> I have been unable to find information about this in any RFC, but our
> resident LDAP expert mentioned that it could be according to some
> extension specification.

Microsoft might offer some specification for this crap.  But I note
that it's an improper extension, as extensions should be truly optional
(per RFC 4521 and common sense).

> I have not been able to find information about it.
>
> To get the rest of the members I have to ask for attribute
> 'member;range=1500-*' and repeat this until the result shows, for
> example, 'range=6000-*' to indicate that this is the last batch of
> members.


If you want to implement this crap, you can do so without additional
support from the LDAP API.  Use the ldap_first_attribute() and
ldap_next_attribute() API.

-- Kurt
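
As a rough model of that suggestion: instead of asking for the values of "member" directly, a client can walk the attribute descriptions the server actually returned and pick out the ones that are either the plain type or the type with a single range option. In C this would be a loop over ldap_first_attribute() and ldap_next_attribute(); the Python sketch below only models the matching logic, with the entry represented as a plain list of (description, values) pairs, which is an assumption made for illustration.

```python
def ranged_values(entry, attr_type):
    """Gather values stored under attr_type or attr_type;range=...

    entry is a list of (attribute description, values) pairs, a stand-in
    for what a client sees when it walks an entry attribute by attribute.
    Descriptions with other options (for example ';lang-en') are skipped,
    since they do not necessarily refer to the same attribute.
    """
    wanted = attr_type.lower()
    collected = []
    for description, values in entry:
        # Attribute type names and options are matched case-insensitively.
        base_type, *options = description.lower().split(";")
        if base_type != wanted:
            continue
        only_range = len(options) == 1 and options[0].startswith("range=")
        if not options or only_range:
            collected.extend(values)
    return collected
```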

Comment 2 Howard Chu 2008-04-16 21:06:17 UTC
changed state Open to Closed
Comment 3 pere@hungry.com 2008-04-21 05:41:09 UTC
Just a quick reply for future reference if others ask about the same
and find this request.

[Kurt Zeilenga]
> No.  Attribute description options cannot contain equal signs.  See
> RFC 4512.

Thank you for your reply.  It was very valuable to me, as I am not that
well versed in the LDAP specification.  It is now obvious to me that
the extension used by Active Directory LDAP is outside the
RFC-documented LDAP specification.

I've been told that the AD range feature is documented in an expired draft
RFC, available from
<URL: http://www.tkk.fi/cc/docs/kerberos/draft-kashi-incremental-00.txt >.
I'm not sure what was discussed about this draft, but it expired a
long time ago.  Anyway, the draft can be used to understand how the
feature works.  It claims that it is possible to see in
supportedControls whether this range feature is used by the server.  This
could be used to enable this feature at runtime, if one wanted to
implement the non-conforming feature.  I suspect I have to go in that
direction, as the project requirements are to use LDAP from AD. :/

> If you want to implement this crap, you can do so without additional
> support from LDAP API.  Use ldap_first/next_attribute API.

Good idea.  I have since found out that this ranged multivalue feature
is implemented in nss-ldap, and hope it is possible to reuse some code
there in nss-ldapd.

Happy hacking,
-- 
Petter Reinholdtsen

Comment 4 sgallagh@redhat.com 2011-01-26 15:43:04 UTC
I'd like to reopen the discussion on this issue. We're hitting this same
problem with the SSSD when dealing with ActiveDirectory. It really
doesn't make sense to me that every consumer of the OpenLDAP libraries
should be required to reimplement this (admittedly incorrect) extension
to ActiveDirectory.

As Petter suggested in his comment from April 21, 2008, ActiveDirectory
provides a server control to identify that the feature is in play.

I feel that it would be beneficial to OpenLDAP's library consumers if
they handled range lookups automatically and internally, similar to the
way that referrals are chased.

Consumers of the OpenLDAP API should be able to reliably assume that if
they ask for the set of values for an attribute of a completed request,
that they will get back all of the values.

Please reconsider adding this support into OpenLDAP.

-- 
Stephen Gallagher

Comment 5 ando@openldap.org 2011-01-26 17:15:11 UTC
> I'd like to reopen the discussion on this issue. We're hitting this same
> problem with the SSSD when dealing with ActiveDirectory. It really
> doesn't make sense to me that every consumer of the OpenLDAP libraries
> should be required to reimplement this (admittedly incorrect) extension
> to ActiveDirectory.
>
> As Petter suggested in his comment from April 21, 2008, ActiveDirectory
> provides a server control to identify that the feature is in play.
>
> I feel that it would be beneficial to OpenLDAP's library consumers if
> they handled range lookups automatically and internally, similar to the
> way that referrals are chased.
>
> Consumers of the OpenLDAP API should be able to reliably assume that if
> they ask for the set of values for an attribute of a completed request,
> that they will get back all of the values.
>
> Please reconsider adding this support into OpenLDAP.

The complexity of handling this nonsense in libldap seems not worth the
effort; I think we might consider working around this in proxy backends
(much like we did for the unsolicited paged results response in back-meta,
ITS#6664, which could be added to back-ldap as well).

I don't think implementing something that requires a theoretically
unbounded number of nested search requests for each attribute value that
contains a range in each SearchResultEntry message makes sense.

The parallel with referrals is not appropriate, since referrals are part
of the LDAP specification; also, please note that automatic referral chasing
is strongly discouraged unless the transport layer is protected (Section 6
of RFC 4511).

p.

Comment 6 ando@openldap.org 2011-01-26 17:17:06 UTC
For future reference, if needed:

<http://msdn.microsoft.com/en-us/library/aa367017(v=vs.85).aspx>

Comment 7 Kurt Zeilenga 2011-01-26 18:11:36 UTC
On Jan 26, 2011, at 9:15 AM, masarati@aero.polimi.it wrote:

> The complexity of handling this nonsense in libldap seems not worth the
> effort;

Ando, you are too kind to refer to this extension as 'nonsense'.  It's worse than nonsense, it's crap.

The spec you pointed at says: "LDAP servers often place limits on the maximum number of attribute values that can be retrieved in a single query."  The spec fails to say why a server would want to do that.  I cannot think of any good reason to do precisely that.  While I can understand a desire to place limits on what clients can retrieve, I don't see why one would distinguish between a single query and a set of queries.  I can also see that it is desirable to allow a client to page through results (but that's a different matter; this is about forcing paging onto clients).

It is incredibly stupid to require clients to use multiple operations to obtain information that the protocol allows to be fetched in a single operation, and this is why I refer to this as "crap".

A major problem with paging, whether server-forced or at client option, is result set coherency due to changes to the DIT between the times the various page operations are performed.  This can lead to problems such as a member of a group not appearing in the results.  While one might argue that clients might be able to deal reasonably with coherency issues, most client developers won't.  This will lead to a range of security issues creeping into directory applications.
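
The coherency problem can be illustrated with a small simulation. It is a sketch under a stated assumption: the server is modelled as answering each ranged request by position in the current value list, which may not match how any particular server orders values. All names are hypothetical.

```python
def read_range(members, start, size):
    """Return the values a ranged request would see at this moment."""
    return members[start:start + size]


def paged_read_with_concurrent_delete(members, size, removed):
    """Read two ranges while `removed` is deleted between the requests."""
    first = read_range(members, 0, size)
    # Another client modifies the entry between the two page requests.
    after_delete = [m for m in members if m != removed]
    second = read_range(after_delete, size, size)
    return first + second
```

With members a, b, c, d, a range size of 2, and a deleted between the two requests, the client ends up with a, b, d: c is still a member of the group but never appears, and a is reported although it has been removed.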

So while I don't generally oppose introduction of paging extensions at the client option, I think the specifications should warn that the clients using them need to be designed to handle data coherency issues. However, most designers of such extensions seem to have put no thought into this issue at all.

I do generally oppose the introduction of extensions that are not truly optional in LDAP (and in other application protocols).  One would think that a requirement that extensions to a protocol be truly optional would go without saying, but in the IETF we actually found it necessary to explicitly state this in the BCP governing LDAP extensions.

That BCP will effectively stop proposals for extensions that are not truly optional, such as this one, from gaining Standards Track status in the IETF.

> I think we might consider working this around in proxy backends
> (much like we did for unsolicited paged results response in back-meta,
> ITS#6664, which could be added to back-ldap as well).

I would think it easier to reconfigure AD not to have the limits which cause this sort of paging.


Comment 8 sgallagh@redhat.com 2011-01-26 18:59:32 UTC
On 01/26/2011 12:15 PM, masarati@aero.polimi.it wrote:
> 
> The complexity of handling this nonsense in libldap seems not worth the
> effort; I think we might consider working this around in proxy backends
> (much like we did for unsolicited paged results response in back-meta,
> ITS#6664, which could be added to back-ldap as well).
> 
> I don't think implementing something that requires a theoretically
> unbounded number of nested search requests for each attribute value that
> contains a range in each SearchResultEntry message makes sense.

I didn't suggest that it should be enabled by default. I was looking for
an option that could be set if a particular client needs it.

> The parallel with referrals is not appropriate, since referrals are part
> of LDAP specification; also, please note that automatic referral chasing
> is strongly discouraged unless the transport layer is protected (Section 6
> of RFC 4511).

Right, I really only meant in the broad sense of "You can enable this
option if you really need it, knowing that there are risks and the
potential for going through multiple additional requests". I just meant
there was precedent for adding features that required extra trips to the
server.

I also fully agree that the range extension is a crime against decency,
but unfortunately it's backed by the 800-pound gorilla that we can't
just ignore. I think if we're stuck with it, it makes sense to put the
code to deal with it in common OpenLDAP library code instead of forcing
every consumer to rewrite it. Because at the end of the day, pretty much
every LDAP client is going to need to be able to talk to Active
Directory at some point.



-- 
Stephen Gallagher

Comment 9 ando@openldap.org 2011-01-26 19:23:05 UTC
>
> On Jan 26, 2011, at 9:15 AM, masarati@aero.polimi.it wrote:
>
>> The complexity of handling this nonsense in libldap seems not worth the
>> effort;
>
> Ando, you are too kind to refer to this extension as 'nonsense'.  It's
> worse than nonsense, it's crap.
>
> The spec you pointed at says: "LDAP servers often place limits on the
> maximum number of attribute values that can be retrieved in a single
> query."

I'd reword that as "One specific LDAP server places limits on the maximum
number of attribute values that can be retrieved in a single query."

> The spec fails to say why a server would want to do that.  I
> cannot think of any good reason to do precisely that.  While I can
> understand a desire to place limits on what clients can retrieve, I don't
> see why one would distinguish between a single query and a set of queries.
> I can also see that it is desirable to allow a client to page through
> results (but that's a different matter; this is about forcing paging onto
> clients).
>
> It is incredibly stupid to require clients to use multiple operations to
> obtain information that the protocol allows to be fetched in a
> single operation, and this is why I refer to this as "crap".

:)

> A major problem with paging, whether server-forced or at client option, is
> result set coherency due to changes to the DIT between the times the
> various page operations are performed.  This can lead to problems such as
> a member of a group not appearing in the results.  While one might argue
> that clients might be able to deal reasonably with coherency issues, most
> client developers won't.  This will lead to a range of security issues
> creeping into directory applications.
>
> So while I don't generally oppose introduction of paging extensions at the
> client option, I think the specifications should warn that the clients
> using them need to be designed to handle data coherency issues. However,
> most designers of such extensions seem to have put no thought into this
> issue at all.

Well, I don't see paged results as a way to introduce more inconsistency
than a regular search operation, since nowhere in the specs do I see that
the collection of entries returned by a search is consistent with the
entries matching the search request at a specific time (e.g. when the
operation was initiated or so).  As a consequence, clients need to be aware
of the fact that search results can be inherently inconsistent, paging or
not.

> I do generally oppose the introduction of extensions that are not truly
> optional in LDAP (and in other application protocols).  One would think
> that a requirement that extensions to a protocol be truly optional would
> go without saying, but in the IETF we actually found it necessary to
> explicitly state this in the BCP governing LDAP extensions.
>
> That BCP will effectively stop proposals for extensions that are not truly
> optional, such as this one, from gaining Standards Track status in the IETF.
>
>> I think we might consider working this around in proxy backends
>> (much like we did for unsolicited paged results response in back-meta,
>> ITS#6664, which could be added to back-ldap as well).
>
> I would think it easier to reconfigure AD not to have the limits which
> cause this sort of paging.

Of course.  What drives this whole discussion (and others), and often
the development of many features in the proxy backends (and of other
features, like slapo-rwm, slapo-translucent and more), is the practical
impossibility of changing the configuration of one or more remote servers.

p.

Comment 10 Kurt Zeilenga 2011-01-26 20:07:32 UTC
On Jan 26, 2011, at 11:23 AM, masarati@aero.polimi.it wrote:

> Well, I don't see paged results as a way to introduce more inconsistency
> than a regular search operation, since nowhere in the specs do I see that
> the collection of entries returned by a search is consistent with the
> entries matching the search request at a specific time (e.g. when the
> operation was initiated or so).  As a consequence, clients need to be aware
> of the fact that search results can be inherently inconsistent, paging or
> not.

As LDAP is defined as an access protocol to an X.500 directory, it inherits some data consistency expectations.  It is generally expected that entry-level access has ACID properties.  That is, each successful modify operation creates a new DIT state and a read operation returns information consistent with a particular state.  For example, if the DIT has state X and there is a modify operation that produces state Y and a concurrently issued read operation (and no other operations), the read operation will return either state X or state Y, depending on whether it is processed before or after the modify operation.  The read ought not to provide some intermediate state.

Because a search operation acts upon a collection of entries, there is a possibility of inconsistencies between different entries (but not within them), especially in distributed DITs.

However, as we're talking about values of an attribute of a particular entry, this is a non-issue in this discussion.