Full_Name: Leo Mocofan
Version: 2.4.31
OS: Debian Testing
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (129.128.11.107)

We are currently testing Linux, OpenBSD and FreeBSD to run our LDAP service. Our tests showed that Linux has great performance numbers in most areas, but we found one test that is very slow.

The database has 1 million entries and is 4.5 GB in size. Importing the whole database with slapadd takes minutes on all platforms. When we tried to import the database with ldapadd, we noticed a huge difference: FreeBSD - 42 min, OpenBSD - 59 min, Linux - 863 min. Changing the file system used for the backend on Linux had no effect. Using a memory-based file system for the backend on Linux, the time for ldapadd dropped to 3 min.

strace of ldapadd shows:

  write(1, "adding new entry \"uid=noname,o"..., 60) = 60
  time(NULL) = 1365202241
  getrusage(RUSAGE_SELF, {ru_utime={1, 684105}, ru_stime={0, 212013}, ...}) = 0
  time(NULL) = 1365202241
  times({tms_utime=168, tms_stime=21, tms_cutime=0, tms_cstime=0}) = 1719044914
  getrusage(RUSAGE_SELF, {ru_utime={1, 684105}, ru_stime={0, 212013}, ...}) = 0
  time(NULL) = 1365202241
  times({tms_utime=168, tms_stime=21, tms_cutime=0, tms_cstime=0}) = 1719044914
  write(4, "\27\27\27\27\17uJ,\324\324\324\324"..., 1237) = 1237
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 100) = 0 (Timeout)
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 100) = 0 (Timeout)
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 100) = 0 (Timeout)
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 100) = 1 ([{fd=4, revents=POLLIN}])

strace of one thread of slapd:

  write(12, "\27\27\27\27\17uJ,\324\324\324\324"..., 117) = 117
  time([1365202241]) = 1365202241
  sendto(4, "<167>Apr 5 16:50:41 slapd[4513]"..., 79, MSG_NOSIGNAL, NULL, 0) = 79
  futex(0x7f36f8fd6724, FUTEX_WAIT_PRIVATE, 9166, NULL) = 0
  futex(0x7f36f8fd66f8, FUTEX_WAKE_PRIVATE, 1) = 0
  time(NULL) = 1365202242
  read(12, "", 5) = 0
  write(6, "0", 1) = 1
  futex(0x7f36f8fd6724, FUTEX_WAIT_PRIVATE, 9169, NULL

strace of another thread of slapd:

  futex(0x7f36f8fd66f8, FUTEX_WAKE_PRIVATE, 1) = 1
  time(NULL) = 1365202241
  epoll_wait(7, {{EPOLLIN, {u32=4177193844, u64=139874082155380}}}, 1024, 150000) = 1
  read(5, "0", 8192) = 1
  time(NULL) = 1365202241
  epoll_wait(7, {{EPOLLIN, {u32=4177193872, u64=139874082155408}}}, 1024, 150000) = 1
  epoll_ctl(7, EPOLL_CTL_MOD, 12, {0, {u32=4177193872, u64=139874082155408}}) = 0
  futex(0x7f36f8fd6724, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f36f8fd6720, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
  time(NULL) = 1365202242
  epoll_wait(7, {{EPOLLIN, {u32=4177193844, u64=139874082155380}}}, 1024, 150000) = 1
  read(5, "0", 8192) = 1
  time(NULL) = 1365202242
  epoll_wait(7, {}, 1024, 150000) = 0
  time(NULL) = 1365202392
  epoll_wait(7,

ldapadd is stuck in the poll system call, while one thread of slapd is waiting in epoll_wait. After a couple of seconds in which apparently nothing happens, everything starts working again. Changing the database backend from bdb to hdb or mdb does not seem to have an effect. We also tried forcing SLAP_EVENT_WAIT to use select instead of epoll_wait on Linux, and ldapadd is still much slower on Linux.

We are running Debian testing as of Apr 5, 2013, with glibc-2.13, linux-3.2.41, libdb-5.1.29, openldap-2.4.31.
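For reference, traces like the ones above can be captured along these lines. This is only a sketch: the bind DN, LDIF file name, and output file names are hypothetical placeholders, not taken from the report.

```shell
# Hypothetical commands for reproducing the traces shown above.
# The DN, password prompt (-W), and database.ldif are placeholders.

# Trace the client while it loads the database:
strace -f -o ldapadd.trace \
    ldapadd -x -D "cn=admin,dc=example,dc=com" -W -f database.ldif

# Attach to a running slapd and follow all of its threads;
# with -ff each thread's syscalls go to a separate slapd.trace.<tid> file:
strace -ff -o slapd.trace -p "$(pidof slapd)"
```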
--On Friday, April 05, 2013 11:36 PM +0000 lm5@ualberta.ca wrote:

> Full_Name: Leo Mocofan
> Version: 2.4.31
> OS: Debian Testing
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (129.128.11.107)

If you are going to test back-mdb, then you need to use a current OpenLDAP release (2.4.35).

In any case, I don't see a bug report here? It sounds more like you're asking for tuning advice? I would ask what filesystem you are using (ext2? ext4?) etc. For ext4, I had to modify dirty_bytes to keep the Debian/Ubuntu flush process from killing load performance.

--Quanah

--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
--On Friday, April 05, 2013 11:46 PM +0000 quanah@zimbra.com wrote:

> --On Friday, April 05, 2013 11:36 PM +0000 lm5@ualberta.ca wrote:
>
>> Full_Name: Leo Mocofan
>> Version: 2.4.31
>> OS: Debian Testing
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (129.128.11.107)
>
> If you are going to test back-mdb, then you need to use a current
> OpenLDAP release (2.4.35).
>
> In any case, I don't see a bug report here? It sounds more like you're
> asking for tuning advice? I would ask what filesystem you are using
> (ext2? ext4?) etc. For ext4, I had to modify dirty_bytes to keep the
> Debian/Ubuntu flush process from killing load performance.

Oh, one other note -- I advise avoiding the debian/ubuntu build versions of BDB, if that is what you are using. They do not build BDB with the correct configure options, seriously impacting performance.

--Quanah

--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
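The dirty_bytes tuning mentioned above caps how much dirty page-cache data may accumulate before the kernel forces writeback. As a sketch, the helper below (hypothetical, not from the thread) converts a dirty-page percentage into an absolute vm.dirty_bytes value for a machine with a given amount of RAM:

```shell
# Hypothetical helper: convert a dirty-page percentage into an absolute
# vm.dirty_bytes value for a machine with the given RAM size in MiB.
ratio_to_dirty_bytes() {
  local ram_mib=$1 ratio=$2
  # MiB -> bytes, then take the percentage (integer arithmetic).
  echo $(( ram_mib * 1024 * 1024 * ratio / 100 ))
}

# Example (run as root; the 10% figure is only illustrative):
#   sysctl -w vm.dirty_bytes=$(ratio_to_dirty_bytes 16384 10)
```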
Hi Quanah,

I have started doing tests with Gentoo Linux running openldap 2.4.35, glibc 2.16 and linux kernels 3.4, 3.8 and 3.9-rc5. I have tried ext2, ext4, reiserfs and jfs. ldapadd is still very slow compared to BSD regardless of the openldap backend.

Also a correction to the initial report: using a memory-based file system, the ldapadd of the whole database took 22 min.

Leo

On Fri, Apr 5, 2013 at 5:48 PM, Quanah Gibson-Mount <quanah@zimbra.com> wrote:

> --On Friday, April 05, 2013 11:46 PM +0000 quanah@zimbra.com wrote:
>
>> --On Friday, April 05, 2013 11:36 PM +0000 lm5@ualberta.ca wrote:
>>
>>> Full_Name: Leo Mocofan
>>> Version: 2.4.31
>>> OS: Debian Testing
>>> URL: ftp://ftp.openldap.org/incoming/
>>> Submission from: (NULL) (129.128.11.107)
>>
>> If you are going to test back-mdb, then you need to use a current
>> OpenLDAP release (2.4.35).
>>
>> In any case, I don't see a bug report here? It sounds more like you're
>> asking for tuning advice? I would ask what filesystem you are using
>> (ext2? ext4?) etc. For ext4, I had to modify dirty_bytes to keep the
>> Debian/Ubuntu flush process from killing load performance.
>
> Oh, one other note -- I advise avoiding the debian/ubuntu build versions
> of BDB, if that is what you are using. They do not build BDB with the
> correct configure options, seriously impacting performance.
>
> --Quanah
--On Friday, April 05, 2013 8:52 PM -0600 lidutu <lm5@ualberta.ca> wrote:

> Hi Quanah,
>
> I have started doing tests with Gentoo Linux running openldap 2.4.35,
> glibc 2.16 and linux kernels 3.4, 3.8 and 3.9-rc5. I have tried ext2,
> ext4, reiserfs and jfs. ldapadd is still very slow compared to BSD
> regardless of the openldap backend.
>
> Also a correction to the initial report: using a memory-based file
> system, the ldapadd of the whole database took 22 min.

So what is happening on the slapd server while things are paused?

Also, if you use an ldapi:/// socket rather than a tcp/ip socket, do you see the same issue?

--Quanah

--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
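Switching from TCP/IP to an ldapi:/// Unix-domain socket could be tested along these lines. The slapd invocation matches the one later shown in the iotop output; the ldapadd DN and LDIF file are hypothetical placeholders (note that a "/" in the socket path must be percent-encoded as "%2f" in the URL):

```shell
# Start slapd listening on a local Unix-domain socket only:
slapd -u ldap -g ldap -f /etc/openldap/slapd.conf \
      -h ldapi://%2fvar%2frun%2fopenldap%2fslapd.sock

# Load the LDIF through the same socket (DN and file are placeholders):
ldapadd -H ldapi://%2fvar%2frun%2fopenldap%2fslapd.sock \
        -x -D "cn=admin,dc=example,dc=com" -W -f database.ldif
```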
> So what is happening on the slapd server while things are paused?
>
> Also, if you use an ldapi:/// socket rather than a tcp/ip socket, do you
> see the same issue?

I have changed to using the ldapi:/// socket and there is no improvement.

It seems that one thread of slapd is repeatedly waiting on epoll_wait. strace prints

  epoll_wait(6,

and then pauses for a couple of seconds.

Iotop shows two slapd threads writing to disk. Right after I start ldapadd, they write to disk at approximately 60 MB/s each. Over time, they go down to 2-3 MB/s each.

    TID  PRIO  USER   DISK READ  DISK WRITE  SWAPIN    IO     COMMAND
  15264  be/4  ldap    0.00 B/s   2.81 M/s   0.00 %  99.99 %  slapd -u ldap -g ldap -f /etc/openldap/slapd.conf -h ldapi://%2fvar%2frun%2fopenldap%2fslapd.sock
  15255  be/4  ldap    0.00 B/s   3.00 M/s   0.00 %  19.48 %  slapd -u ldap -g ldap -f /etc/openldap/slapd.conf -h ldapi://%2fvar%2frun%2fopenldap%2fslapd.sock

I took an strace of all the processes and threads involved over a 2 minute period.

Process slapd - main thread:

  futex(0x7f2efa38e9d0, FUTEX_WAIT, 15251, NULL

Process slapd - thread:

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   57.08    1.639877         474      3457       268 futex
   42.54    1.222208        1076      1136           fdatasync
    0.17    0.004913           0    272650           writev
    0.11    0.003194           3      1137           pwrite
    0.06    0.001736           0    272649           lseek
    0.02    0.000530           0      4544      1136 read
    0.01    0.000282           0      2273           write
    0.00    0.000132           0      2273           sendto
    0.00    0.000065           0      1136           open
    0.00    0.000061           0      1136           stat
    0.00    0.000041           0      1136           epoll_ctl
    0.00    0.000011           0      1136           close
    0.00    0.000000           0       116           mprotect
    0.00    0.000000           0      2272           fcntl
    0.00    0.000000           0      1136           getuid
    0.00    0.000000           0      1136           getppid
    0.00    0.000000           0      1136           gettid
  ------ ----------- ----------- --------- --------- ----------------
  100.00    2.873050                570459      1404 total

Process slapd - thread:

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   50.67    1.128008         399      2828       213 futex
   48.96    1.089974        1161       939           fdatasync
    0.15    0.003311           0    226607           writev
    0.13    0.002928           3       939           pwrite
    0.07    0.001544           0    226607           lseek
    0.01    0.000214           0      1879           write
    0.01    0.000139           0      1879           sendto
    0.00    0.000065           0       940           open
    0.00    0.000046           0      3760       940 read
    0.00    0.000029           0        94           mprotect
    0.00    0.000021           0       940           close
    0.00    0.000021           0       940           getppid
    0.00    0.000019           0       940           getuid
    0.00    0.000015           0       940           epoll_ctl
    0.00    0.000014           0       940           stat
    0.00    0.000000           0      1880           fcntl
    0.00    0.000000           0       940           gettid
  ------ ----------- ----------- --------- --------- ----------------
  100.00    2.226348                473992      1153 total

Process slapd - thread:

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
    -nan    0.000000           0      1386           read
    -nan    0.000000           0     13861           sendto
    -nan    0.000000           0      3866       294 futex
    -nan    0.000000           0      2772           epoll_wait
    -nan    0.000000           0      1386           epoll_ctl
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.000000                 23271       294 total

Process ldapadd:

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   99.19    0.056533          18      3194           poll
    0.48    0.000276           0      4866           read
    0.32    0.000183           0      6846           write
    0.00    0.000000           0         1           restart_syscall
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.056992                 14907           total
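Per-thread syscall summaries like the tables above can be gathered roughly as follows; the thread ID is a placeholder, and the 120-second timeout matches the 2-minute window described in the message:

```shell
# List the thread IDs of a running slapd:
ps -L -o tid= -p "$(pidof slapd)"

# Summarise one thread's syscalls for 2 minutes; -c counts calls and
# prints a table like the ones above when strace detaches on SIGTERM.
# <tid> is a placeholder for one of the IDs listed above.
timeout 120 strace -c -p <tid>

# Likewise for the ldapadd client process:
timeout 120 strace -c -p "$(pidof ldapadd)"
```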
--On Saturday, April 06, 2013 1:07 PM -0600 lidutu <lm5@ualberta.ca> wrote:

>> So what is happening on the slapd server while things are paused?
>>
>> Also, if you use an ldapi:/// socket rather than a tcp/ip socket, do you
>> see the same issue?
>
> I have changed to using the ldapi:/// socket and there is no improvement.
>
> It seems that one thread of slapd is repeatedly waiting on epoll_wait.
> strace prints
>   epoll_wait(6,
> and then pauses for a couple of seconds.
>
> Iotop shows two slapd threads writing to disk. Right after I start
> ldapadd, they write to disk at approximately 60 MB/s each. Over time, they
> go down to 2-3 MB/s each.
>
>     TID  PRIO  USER   DISK READ  DISK WRITE  SWAPIN    IO     COMMAND
>   15264  be/4  ldap    0.00 B/s   2.81 M/s   0.00 %  99.99 %  slapd -u ldap -g ldap -f /etc/openldap/slapd.conf -h ldapi://%2fvar%2frun%2fopenldap%2fslapd.sock
>   15255  be/4  ldap    0.00 B/s   3.00 M/s   0.00 %  19.48 %  slapd -u ldap -g ldap -f /etc/openldap/slapd.conf -h ldapi://%2fvar%2frun%2fopenldap%2fslapd.sock

I noticed something similar on ubuntu10 as well. I upgraded my server to ubuntu12, and I also set /proc/sys/vm/dirty_ratio to 90 and /proc/sys/vm/dirty_expire_centisecs to 60000, and that helped significantly with back-mdb using current OpenLDAP master. An ldapadd of almost 3.5 million users now takes approximately 4 hours for my dataset vs 16, so 4x faster.

--Quanah

--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
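The two writeback settings described above, with the exact values from the message, can be applied as root like this (note these are aggressive bulk-load settings, not general-purpose defaults, and they do not persist across reboots):

```shell
# Values taken from the message above; run as root.
echo 90    > /proc/sys/vm/dirty_ratio
echo 60000 > /proc/sys/vm/dirty_expire_centisecs

# Equivalent sysctl form:
#   sysctl -w vm.dirty_ratio=90 vm.dirty_expire_centisecs=60000
```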
> I noticed something similar on ubuntu10 as well. I upgraded my server to
> ubuntu12, and I also set /proc/sys/vm/dirty_ratio to 90 and
> /proc/sys/vm/dirty_expire_centisecs to 60000, and that helped
> significantly with back-mdb using current OpenLDAP master. An ldapadd of
> almost 3.5 million users takes approximately 4 hours now for my dataset
> vs 16, so 4x faster.

Thank you for your suggestions. I tried these two settings and ldapadd wrote 3 GB in 8 hours (out of 4.3 GB). How big is your database?

Have you considered running the database from a ramdisk? Using ramfs in Linux, I got the fastest times on all the tests I tried.

Leo Mocofan
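A minimal sketch of the ramfs setup suggested above; the mount point is a hypothetical placeholder, and anything stored in ramfs is lost on reboot or crash, so this suits benchmarking and bulk loads rather than production data:

```shell
# Create and mount a RAM-backed filesystem (run as root;
# /mnt/ldap-ram is a placeholder path):
mkdir -p /mnt/ldap-ram
mount -t ramfs ramfs /mnt/ldap-ram
chown ldap:ldap /mnt/ldap-ram

# Then point the backend at it in slapd.conf, e.g.:
#   directory /mnt/ldap-ram
```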
--On Monday, April 08, 2013 8:31 PM -0600 Leo Mocofan <lm5@ualberta.ca> wrote:

>> I noticed something similar on ubuntu10 as well. I upgraded my server to
>> ubuntu12, and I also set /proc/sys/vm/dirty_ratio to 90 and
>> /proc/sys/vm/dirty_expire_centisecs to 60000, and that helped
>> significantly with back-mdb using current OpenLDAP master. An ldapadd of
>> almost 3.5 million users takes approximately 4 hours now for my dataset
>> vs 16, so 4x faster.
>
> Thank you for your suggestions. I tried these two settings and ldapadd
> wrote 3 GB in 8 hours (out of 4.3 GB). How big is your database?

My DB is 5.6GB in size after load is complete.

--Quanah

--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration