[Date Prev][Date Next] [Chronological] [Thread] [Top]

RE: RE: Re: (ITS#9017) Improving performance of commit sync in Windows



<html xmlns:o=3D"urn:schemas-microsoft-com:office:office" xmlns:w=3D"urn:sc=
hemas-microsoft-com:office:word" xmlns:m=3D"http://schemas.microsoft.com/of=
fice/2004/12/omml" xmlns=3D"http://www.w3.org/TR/REC-html40";><head><meta ht=
tp-equiv=3DContent-Type content=3D"text/html; charset=3Dutf-8"><meta name=
=3DGenerator content=3D"Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
	{font-family:"Cambria Math";
	panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
	{font-family:Calibri;
	panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0cm;
	margin-bottom:.0001pt;
	font-size:11.0pt;
	font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
	{mso-style-priority:99;
	color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{mso-style-priority:99;
	color:#954F72;
	text-decoration:underline;}
.MsoChpDefault
	{mso-style-type:export-only;
	font-family:"Calibri",sans-serif;}
@page WordSection1
	{size:612.0pt 792.0pt;
	margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
	{page:WordSection1;}
--></style></head><body lang=3DEN-CA link=3Dblue vlink=3D"#954F72"><div cla=
ss=3DWordSection1><p class=3DMsoNormal>Here is a pull request for these win=
dows upgrades (I know it won=E2=80=99t be applied directly in GitHub, but s=
eems a convenient host for the patches):</p><p class=3DMsoNormal><a href=3D=
"https://github.com/LMDB/lmdb/pull/18";><span style=3D'color:#954F72'>https:=
//github.com/LMDB/lmdb/pull/18</span></a><o:p></o:p></p><p class=3DMsoNorma=
l><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>The first commit is basically t=
he changes mentioned in this thread for improving sync performance by using=
 write-through writes instead FlushFileBuffers/FlushViewOfFile, reapplied t=
o the mdb.master branch, with conflicts resolved.</p><p class=3DMsoNormal><=
a href=3D"https://github.com/LMDB/lmdb/commit/204d1710d7a34eff142f2654d7678=
182e1b9e054.patch"><span style=3D'color:#954F72'>https://github.com/LMDB/lm=
db/commit/204d1710d7a34eff142f2654d7678182e1b9e054.patch</span></a></p><p c=
lass=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>The second commi=
t adds support for an option for opening map files by providing the mapsize=
 as the file size (MaximumSize) to NtCreateSection (which also improves per=
formance and prevents freezing caused by file growth). I implemented this a=
s a compiler preprocessor define, as requested. However, I would note that =
I would still prefer this to be a runtime option. When bundling LMDB in a n=
ode package, there is no way to parameterize the compiler options (that I k=
now of) for distribution, and it would be nice to have this an option for d=
ifferent users of the node package.</p><p class=3DMsoNormal><a href=3D"http=
s://github.com/LMDB/lmdb/commit/6f94bb1da1812c1ddb68e33448c9cd381674b02d.pa=
tch"><span style=3D'color:#954F72'>https://github.com/LMDB/lmdb/commit/6f94=
bb1da1812c1ddb68e33448c9cd381674b02d.patch</span></a></p><p class=3DMsoNorm=
al><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>However, in trying to test the=
 mdb.master branch code with our application/server (we were previously usi=
ng the 0.9 branch), there was a regression causing it to crash pretty much =
any time we attempted to write to a db that was larger than 2GB. After some=
 considerable testing and investigation (which is why it took a while to ge=
t this finished), it seems the cause is the use of the off_t type for file =
pointer/positions, which appears to be a 32-bit signed integer compiling on=
 Windows. This overflows for dbs over 2GB and causes references that crash =
the process. Replacing all the off_t types with size_t (unsigned 64-bit), f=
ixed this issue. I am not sure if this is the right replacement type. I thi=
nk this could also be addressed with compiler option for changing the file =
offset type size, but that seems like a hazardous hoop to jump through. But=
 the third commit, replacing the off_t with size_t definitely fixed the pro=
blem in our application.</p><p class=3DMsoNormal><a href=3D"https://github.=
com/LMDB/lmdb/commit/45cf4b6ede38565cfab1c40d0d77961a0cb22b9e.patch"><span =
style=3D'color:#954F72'>https://github.com/LMDB/lmdb/commit/45cf4b6ede38565=
cfab1c40d0d77961a0cb22b9e.patch</span></a></p><p class=3DMsoNormal><o:p>&nb=
sp;</o:p></p><p class=3DMsoNormal>With these three commits we are seeing so=
lid stability and performance on Windows with the master branch. Thanks for=
 considering this!</p><p class=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3D=
MsoNormal>Thanks,<br>Kris</p><p class=3DMsoNormal><o:p>&nbsp;</o:p></p><div=
 style=3D'mso-element:para-border-div;border:none;border-top:solid #E1E1E1 =
1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=3DMsoNormal style=3D'border:none;=
padding:0cm'><b>From: </b><a href=3D"mailto:kriszyp@gmail.com";>kriszyp@gmai=
l.com</a><br><b>Sent: </b>February 18, 2020 1:06 PM<br><b>To: </b><a href=
=3D"mailto:hyc@symas.com";>Howard Chu</a><br><b>Cc: </b><a href=3D"mailto:op=
enldap-its@openldap.org">openldap-its@OpenLDAP.org</a><br><b>Subject: </b>R=
E: Re: (ITS#9017) Improving performance of commit sync in Windows</p></div>=
<p class=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>Looking into=
 the APIs, I would guess that CreateFileMapping=E2=80=99s dwMaximumSizeHigh=
 and dwMaximumSizeLow arguments are mapped to NtCreateSection=E2=80=99s Max=
imumSize argument (4<sup>th</sup> argument). In LMDB=E2=80=99s 0.9 CreateFi=
leMapping call, the size is provided (since it is required), whereas NtCrea=
teSection has MaximumSize as optional, and in LMDB (mdb.master), it is not =
provided. Passing the size in (with MaximumSize argument) to NtCreateSectio=
n seems to result in the same behavior as CreateFileMapping of preallocatin=
g a file size equivalent to the max/mapping size, rather than dynamically g=
rowing.<o:p></o:p></p><p class=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3D=
MsoNormal>It is challenging to readily test for the performance issues; it =
took a fair bit of effort and load testing to trigger the long-duration per=
formance/freezing issues we had seen before on Azure. However, in some more=
 isolated tests, providing a MaximumSize to NtCreateSection does seem to pe=
rform better than leaving it unspecified (specifically the system CPU usage=
 seems to about be about twice as much with dynamic growth than fixed file =
size), and the performance seems about the same as CreateFileMapping. And I=
 think we both had suspected that it was likely due to the overhead or inef=
ficiencies in how Windows was dynamically growing the (size of the) mapped =
file, and having Windows used a predefined fixed file size, whether through=
 NtCreateSection or CreateFileMapping, probably would prevent these slowdow=
n issues.<o:p></o:p></p><p class=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=
=3DMsoNormal>But, whether we use NtCreateSection or CreateFileMapping, we a=
re still faced with the same trade-off of whether to use dynamic Windows-co=
ntrolled file growth (convenient since the map size can be set arbitrarily =
large), or using pre-specified file size (to avoid the slowing/freezing tha=
t seems to sometimes result from dynamic Windows growth). So I assume it wo=
uld still be reasonable to offer a compile time option, so users could opt-=
in to using a fixed file size (provide the MaximumSize to NtCreateSection)?=
 Fortunately this would be a trivial addition, just a few lines of code to =
use the preprocessor switch to supply this argument.<o:p></o:p></p><p class=
=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal><a href=3D"http://un=
documented.ntinternals.net/index.html?page=3DUserMode%2FUndocumented%20Func=
tions%2FNT%20Objects%2FSection%2FNtMapViewOfSection.html">http://undocument=
ed.ntinternals.net/index.html?page=3DUserMode%2FUndocumented%20Functions%2F=
NT%20Objects%2FSection%2FNtMapViewOfSection.html</a><o:p></o:p></p><p class=
=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>Thanks,<br>Kris<o:p>=
</o:p></p><p class=3DMsoNormal><o:p>&nbsp;</o:p></p><div style=3D'border:no=
ne;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=3DMso=
Normal><b>From: </b><a href=3D"mailto:hyc@symas.com";>Howard Chu</a><br><b>S=
ent: </b>February 18, 2020 8:15 AM<br><b>To: </b><a href=3D"mailto:kriszyp@=
gmail.com">Kris Zyp</a><br><b>Cc: </b><a href=3D"mailto:openldap-its@openld=
ap.org">openldap-its@OpenLDAP.org</a><br><b>Subject: </b>Re: (ITS#9017) Imp=
roving performance of commit sync in Windows<o:p></o:p></p></div><p class=
=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>Kris Zyp wrote:<o:p>=
</o:p></p><p class=3DMsoNormal>&gt; Yes, I will work on upgrading this patc=
h to 1.0.<o:p></o:p></p><p class=3DMsoNormal>&gt; <o:p></o:p></p><p class=
=3DMsoNormal>&gt; However, I believe in order to realize optimal Windows pe=
rformance without regression/slow-downs, I will also need to disable the us=
e of NTDLL (and use direct<o:p></o:p></p><p class=3DMsoNormal>&gt; CreateFi=
leMapping memory maps instead), as mentioned&nbsp; http://www.openldap.org/=
lists/openldap-bugs/201902/msg00009.html. You had mentioned in that thread =
that<o:p></o:p></p><p class=3DMsoNormal>&gt; the NTDLL code wasn't destined=
 for release, is that still the case (it is still in the mdb.master branch)=
? If you are intending to release (1.0) with the<o:p></o:p></p><p class=3DM=
soNormal>&gt; NTDLL-based memory maps (which is understandable, can certain=
ly be an easier path for Windows users getting started), I will also need t=
o add the compiler<o:p></o:p></p><p class=3DMsoNormal>&gt; option to disabl=
e it. Anyway, I will work on creating a PR with both these patches.<o:p></o=
:p></p><p class=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>Might=
 be interesting to investigate why the behavior with NTDLL is so much slowe=
r on Azure. Since<o:p></o:p></p><p class=3DMsoNormal>the Win32 APIs are imp=
lemented on top of the NT APIs, you would expect them to perform identicall=
y.<o:p></o:p></p><p class=3DMsoNormal>Perhaps there's some additional flag =
that the Win32 API is passing to NT that is needed on Azure.<o:p></o:p></p>=
<p class=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal>&gt; <o:p></=
o:p></p><p class=3DMsoNormal>&gt; Thanks,<o:p></o:p></p><p class=3DMsoNorma=
l>&gt; Kris<o:p></o:p></p><p class=3DMsoNormal>&gt; <o:p></o:p></p><p class=
=3DMsoNormal>&gt; On Mon, Feb 17, 2020 at 8:14 AM Howard Chu &lt;hyc@symas.=
com &lt;mailto:hyc@symas.com&gt;&gt; wrote:<o:p></o:p></p><p class=3DMsoNor=
mal>&gt; <o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; K=
ris Zyp wrote:<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nb=
sp; &gt; Sorry to keep pestering, but just pinging about this patch again, =
as I still think this fix could benefit windows users. And at this point, I=
 think I can<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp=
; say we<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &g=
t; have tested it pretty well, running on our servers for almost a year :).=
<o:p></o:p></p><p class=3DMsoNormal>&gt; <o:p></o:p></p><p class=3DMsoNorma=
l>&gt;&nbsp;&nbsp;&nbsp;&nbsp; Looks like this patch is against the 0.9 rel=
ease branch. I hit a bunch of conflicts<o:p></o:p></p><p class=3DMsoNormal>=
&gt;&nbsp;&nbsp;&nbsp;&nbsp; trying to apply it to mdb.master. We'll be sto=
pping work on 0.9 soon, and getting<o:p></o:p></p><p class=3DMsoNormal>&gt;=
&nbsp;&nbsp;&nbsp;&nbsp; LMDB 1.0 out the door finally, so can you please v=
erify that your changes will work<o:p></o:p></p><p class=3DMsoNormal>&gt;&n=
bsp;&nbsp;&nbsp;&nbsp; on mdb.master as well?<o:p></o:p></p><p class=3DMsoN=
ormal>&gt; <o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp;=
 &gt; Thanks,<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbs=
p; &gt; Kris<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp=
; &gt;<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &gt;=
 On Wed, Sep 18, 2019 at 12:56 PM Kris Zyp &lt;kriszyp@gmail.com &lt;mailto=
:kriszyp@gmail.com&gt; &lt;mailto:kriszyp@gmail.com &lt;mailto:kriszyp@gmai=
l.com&gt;&gt;&gt; wrote:<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp=
;&nbsp;&nbsp; &gt;<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp=
;&nbsp; &gt;&nbsp; &nbsp; &nbsp;Checking on this again, is this still a pos=
sibility for merging into LMDB? This fix is still working great (improved p=
erformance) on our systems.<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&n=
bsp;&nbsp;&nbsp; &gt;&nbsp; &nbsp; &nbsp;Thanks,<o:p></o:p></p><p class=3DM=
soNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &gt;&nbsp; &nbsp; &nbsp;Kris<o:p></o:=
p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &gt;<o:p></o:p></p>=
<p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &gt;&nbsp; &nbsp; &nbsp;O=
n Mon, Jun 17, 2019 at 1:04 PM Kris Zyp &lt;kriszyp@gmail.com &lt;mailto:kr=
iszyp@gmail.com&gt; &lt;mailto:kriszyp@gmail.com &lt;mailto:kriszyp@gmail.c=
om&gt;&gt;&gt; wrote:<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&n=
bsp;&nbsp; &gt;<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&n=
bsp; &gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Is this still being considered/r=
eviewed? Let me know if there are any other changes you would like me to ma=
ke. This patch has continued to yield<o:p></o:p></p><p class=3DMsoNormal>&g=
t;&nbsp;&nbsp;&nbsp;&nbsp; &gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;significan=
t and reliable performance improvements for us, and seems like it would be =
nice for this to be available for other Windows users.<o:p></o:p></p><p cla=
ss=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &gt;<o:p></o:p></p><p class=3DM=
soNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp=
;On Fri, May 3, 2019 at 3:52 PM Kris Zyp &lt;kriszyp@gmail.com &lt;mailto:k=
riszyp@gmail.com&gt; &lt;mailto:kriszyp@gmail.com &lt;mailto:kriszyp@gmail.=
com&gt;&gt;&gt; wrote:<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&=
nbsp;&nbsp; &gt;<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&=
nbsp; &gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;For the sake of p=
utting this in the email thread (other code discussion in GitHub), here is =
the latest squashed commit of the proposed patch (with<o:p></o:p></p><p cla=
ss=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &gt;&nbsp; &nbsp; &nbsp; &nbsp;=
 &nbsp; &nbsp; &nbsp;the on-demand, retained overlapped array to reduce re-=
malloc and opening event handles):<o:p></o:p></p><p class=3DMsoNormal>&gt;&=
nbsp;&nbsp;&nbsp;&nbsp; &gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp=
;https://github.com/kriszyp/node-lmdb/commit/726a9156662c703bf3d453aab75ee2=
22072b990f<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; =
&gt;<o:p></o:p></p><p class=3DMsoNormal>&gt; <o:p></o:p></p><p class=3DMsoN=
ormal>&gt; <o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp;=
 -- <o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;=
 -- Howard Chu<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nb=
sp; &nbsp; CTO, Symas Corp.&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;http://=
www.symas.com<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbs=
p; &nbsp; Director, Highland Sun&nbsp; &nbsp; &nbsp;http://highlandsun.com/=
hyc/<o:p></o:p></p><p class=3DMsoNormal>&gt;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;=
 Chief Architect, OpenLDAP&nbsp; http://www.openldap.org/project/<o:p></o:p=
></p><p class=3DMsoNormal>&gt; <o:p></o:p></p><p class=3DMsoNormal><o:p>&nb=
sp;</o:p></p><p class=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal=
>-- <o:p></o:p></p><p class=3DMsoNormal>&nbsp;&nbsp;-- Howard Chu<o:p></o:p=
></p><p class=3DMsoNormal>&nbsp; CTO, Symas Corp.&nbsp;&nbsp;&nbsp;&nbsp;&n=
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; http://www.symas.com<o:p></o:p></p><p cl=
ass=3DMsoNormal>&nbsp; Director, Highland Sun&nbsp;&nbsp;&nbsp;&nbsp; http:=
//highlandsun.com/hyc/<o:p></o:p></p><p class=3DMsoNormal>&nbsp; Chief Arch=
itect, OpenLDAP&nbsp; http://www.openldap.org/project/<o:p></o:p></p><p cla=
ss=3DMsoNormal><o:p>&nbsp;</o:p></p><p class=3DMsoNormal><o:p>&nbsp;</o:p><=
/p></div></body></html>=