This summer I ran into a problem with conversions that resulted in zombie instances of the rundll32.exe child processes being left around after certain errors. A fix was made to the v5 version of the code and it looked like the issue had been resolved. (See:
http://www.essentialobjects.com/forum/postst8426_EOPDF-Out-of-Memory-and-Zombie-Rundll32exe.aspx) However, it appears that the issue has returned from the dead... with a mutation!
While using EO.Pdf v6.0.21.2, we process a 40MB HTML file that results in the rundll32.exe child process consuming roughly 1.8GB of memory (and over 3GB in the 64-bit service) but successfully completing the conversion. If we pass the exact same 40MB HTML file through again in the same process, a second rundll32.exe child instance is spawned, but it dies almost immediately with the following error:
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at System.String.GetStringForStringBuilder(String value, Int32 startIndex, Int32 length, Int32 capacity)
at System.Text.StringBuilder.GetNewString(String currentString, Int32 requiredLength)
at System.Text.StringBuilder.Append(String value)
at System.Text.StringBuilder.AppendFormat(IFormatProvider provider, String format, Object[] args)
at System.String.Format(IFormatProvider provider, String format, Object[] args)
at EO.Pdf.Internal.agh.a(HtmlToPdfOptions A_0, String A_1, Boolean A_2)
at EO.Pdf.Internal.agh.b(HtmlToPdfOptions A_0, String A_1, Boolean A_2)
at EO.Pdf.Internal.agh.a(asq A_0)
at EO.Pdf.Internal.ma.e.a(Byte[] A_0)
at EO.Pdf.Internal.apq.b(BinaryReader A_0)
at EO.Pdf.Internal.ma.a(apq A_0)
at EO.Pdf.HtmlToPdfSession.a(apq A_0)
at EO.Pdf.HtmlToPdf.ConvertHtml(String html, PdfDocument doc, HtmlToPdfOptions options)
...
If we pass the same 40MB HTML file through a third time, another instance of rundll32.exe is spawned and the file converts without an error, but now we have two child processes each using 1.8GB of memory, plus the service using up to 4GB of memory.
If we pass the same 40MB HTML file through a fourth and fifth time, the behavior matches the 2nd pass: rundll32.exe instances are spawned but die off almost immediately, leaving the two 1.8GB child processes from before still lying around. This seems to block all further requests on this file. I did not leave the app in a sleep loop to see whether the child processes are eventually cleaned up, but they do seem to go away when the process exits (unlike the first zombie case).
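One way to check whether the lingering children are eventually cleaned up is to poll the running rundll32.exe instances between passes. The following is only a rough sketch and not part of the original test: it assumes a Windows machine with Python and the stock `tasklist` command, the function names are made up for illustration, and it lists every rundll32.exe on the box, not just the ones spawned by EO.Pdf.

```python
import csv
import io
import subprocess
import time


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into (pid, mem_kb) tuples."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        # Expected columns: image name, PID, session name, session#, mem usage.
        # Blank lines and the "INFO: No tasks ..." message are skipped.
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        # Mem usage looks like "1,843,200 K"; keep only the digits.
        digits = "".join(ch for ch in fields[4] if ch.isdigit())
        if digits:
            rows.append((int(fields[1]), int(digits)))
    return rows


def list_rundll32():
    """Return (pid, mem_kb) for every rundll32.exe currently running."""
    out = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq rundll32.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out)


if __name__ == "__main__":
    # Print a snapshot every 10 seconds until interrupted with Ctrl+C.
    while True:
        procs = list_rundll32()
        summary = ", ".join(f"pid {pid} = {kb / 1024:.0f} MB" for pid, kb in procs)
        print(time.strftime("%H:%M:%S"), len(procs), "instance(s):", summary)
        time.sleep(10)
```

Leaving this running while the service idles after a failed pass would show whether the 1.8GB instances ever drop off on their own or only disappear when the parent process exits.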
If I change to EO.Pdf v6.0.26.2 (latest on NuGet) then I get slightly different behavior - every other request fails with:
EO.Pdf.Internal.aol: This session is no longer valid. If you wish to reuse the session later, please consider calling GetCookies to retain the session cookies, then reuse these cookies through HtmlToPdfOptions.Cookies with another session. (4)
at EO.Pdf.Internal.md.h()
at EO.Pdf.Internal.md.a(ap0 A_0)
at EO.Pdf.HtmlToPdfSession.a(ap0 A_0)
at EO.Pdf.HtmlToPdf.ConvertHtml(String html, PdfDocument doc, HtmlToPdfOptions options)
...
and instead of the newly spawned child instance being killed off, it's the large 1.4GB child that is killed. This is much better/cleaner behavior, but it leaves three questions. Why is it spawning a 2nd instance of the child and then using the first instance to try to do the processing? Why didn't the first instance clean itself up so that there wasn't 1.4 - 1.8GB of memory being used? (If the answer is that it is lazily cleaned up on the next use of the child, then why is it crashing with out of memory?) And finally, why is so much memory taken up in the main process even after those PDF document instances have been disposed?
I will zip up and send the 40MB HTML file through the usual support submission route.