The Case for Learned Index Structures

To: "OpenLDAP-devel@openldap.org" <OpenLDAP-devel@openldap.org>
Subject: The Case for Learned Index Structures
From: Howard Chu <hyc@symas.com>
Date: Mon, 11 Dec 2017 15:52:37 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0 SeaMonkey/2.53a1

This has been making some waves today on social media:

https://www.arxiv-vanity.com/papers/1712.01208v1/

For now, it's only a novelty. Just like perfect hash functions. It assumes a static data set being used in read-only fashion, so it's unsuitable for a directory or database that serves ongoing modifications. It also assumes an entire data set fits in RAM, which is generally not true for database applications. In particular, the "fast" case of using highly parallel GPUs assumes everything fits inside GPU RAM, which is even more tightly constrained than server main memory.

It's axiomatic that if you have advance knowledge about the shape/characteristics of a dataset, you can construct a dedicated mapping function for that dataset that is perfectly optimal, and outperforms any general-purpose mapping. That's kind of the point of general-purpose mappings - they are general. There are plenty of use cases where this fact may be useful. In LDAP and any database system ingesting data in realtime, these findings are irrelevant since advance knowledge of the dataset doesn't exist.
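
To make the "dedicated mapping function" idea concrete, here is a minimal sketch in Python. It is not the paper's method (the paper uses a hierarchy of learned models); it is a single straight-line fit over a sorted, static list of unique integer keys, with the worst-case prediction error recorded at build time so a lookup only has to search a small window. The names build_linear_index and lookup are made up for this illustration.

```python
import bisect

def build_linear_index(keys):
    """Fit position ~ slope * key + intercept over a sorted, static key list.

    Assumes keys is non-empty, sorted, and contains unique integers.
    Returns (slope, intercept, max_err), where max_err is the largest
    distance between a predicted and an actual position at build time.
    """
    n = len(keys)
    lo, hi = keys[0], keys[-1]
    slope = (n - 1) / (hi - lo) if hi != lo else 0.0
    intercept = -slope * lo
    max_err = 0
    for pos, key in enumerate(keys):
        pred = round(slope * key + intercept)
        max_err = max(max_err, abs(pred - pos))
    return slope, intercept, max_err

def lookup(keys, model, key):
    """Return the position of key in keys, or None if it is absent."""
    slope, intercept, max_err = model
    n = len(keys)
    pred = round(slope * key + intercept)
    # Clamp the search window to valid bounds; keys never seen at build
    # time can produce predictions far outside the array.
    start = min(max(0, pred - max_err), n)
    end = max(start, min(n, pred + max_err + 1))
    i = bisect.bisect_left(keys, key, start, end)
    if i < n and keys[i] == key:
        return i
    return None

# Evenly spaced keys: the line fits well and the search window is tiny.
uniform = list(range(0, 1000, 10))
uniform_model = build_linear_index(uniform)

# Skewed keys: the same model shape fits poorly, so the window grows.
skewed = [x * x for x in range(100)]
skewed_model = build_linear_index(skewed)
```

The error bound is only valid for the keys present when the model was built. Insert or delete a key and every later position shifts, so the model has to be refit before lookups can be trusted again, which is the read-only assumption in a nutshell.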


--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/