EO.Pdf, Out of Memory Error, and Rundll32.exe Zombies? (AGAIN!)
CWoods
Posted: Tuesday, December 9, 2014 1:05:21 AM
Rank: Advanced Member
Groups: Member

Joined: 7/14/2014
Posts: 40
This summer I ran into a problem with conversions that resulted in zombie instances of the rundll32.exe child processes being left around after certain errors. A fix was made to the v5 version of the code and it looked like the issue had been resolved. (See: http://www.essentialobjects.com/forum/postst8426_EOPDF-Out-of-Memory-and-Zombie-Rundll32exe.aspx) However, it appears that the issue has returned from the dead... with a mutation!

While using EO.Pdf v6.0.21.2, we process a 40MB HTML file that results in the rundll32.exe child process consuming roughly 1.8GB of memory (and over 3GB in the 64-bit service) but successfully completing the conversion. If we pass the exact same 40MB HTML file through again in the same process, a second rundll32.exe child instance is spawned but then it dies almost immediately with the following error:

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at System.String.GetStringForStringBuilder(String value, Int32 startIndex, Int32 length, Int32 capacity)
at System.Text.StringBuilder.GetNewString(String currentString, Int32 requiredLength)
at System.Text.StringBuilder.Append(String value)
at System.Text.StringBuilder.AppendFormat(IFormatProvider provider, String format, Object[] args)
at System.String.Format(IFormatProvider provider, String format, Object[] args)
at EO.Pdf.Internal.agh.a(HtmlToPdfOptions A_0, String A_1, Boolean A_2)
at EO.Pdf.Internal.agh.b(HtmlToPdfOptions A_0, String A_1, Boolean A_2)
at EO.Pdf.Internal.agh.a(asq A_0)
at EO.Pdf.Internal.ma.e.a(Byte[] A_0)
at EO.Pdf.Internal.apq.b(BinaryReader A_0)
at EO.Pdf.Internal.ma.a(apq A_0)
at EO.Pdf.HtmlToPdfSession.a(apq A_0)
at EO.Pdf.HtmlToPdf.ConvertHtml(String html, PdfDocument doc, HtmlToPdfOptions options)
...

If we pass the same 40MB HTML file through a third time, another instance of rundll32.exe is spawned and the file converts w/out an error - but now we have 2 child processes using 1.8GB of memory + the service using up to 4GB of memory.

If we pass the same 40MB HTML file through a fourth and fifth time, it's like the 2nd pass in that rundll32.exe instances are spawned but die off almost immediately, leaving the two 1.8GB child processes from before still lying around. This seems to block all further requests on this file. I did not let the app sleep in a loop to see whether the child processes are eventually cleaned up, but they do seem to go away when the process exits (unlike the first zombie case).

If I change to EO.Pdf v6.0.26.2 (latest on NuGet) then I get slightly different behavior - every other request fails with:

EO.Pdf.Internal.aol: This session is no longer valid. If you wish to reuse the session later, please consider calling GetCookies to retain the session cookies, then reuse these cookies through HtmlToPdfOptions.Cookies with another session. (4)
at EO.Pdf.Internal.md.h()
at EO.Pdf.Internal.md.a(ap0 A_0)
at EO.Pdf.HtmlToPdfSession.a(ap0 A_0)
at EO.Pdf.HtmlToPdf.ConvertHtml(String html, PdfDocument doc, HtmlToPdfOptions options)
...

and instead of the newly spawned child instance being killed off, it's the large 1.4GB child that is killed. This is much better/cleaner behavior except for three problems - Why is it spawning a 2nd instance of the child and then using the first instance to try to do the processing? And why didn't the first instance clean itself up so that there wasn't 1.4 - 1.8GB of memory being used? (If the answer is that it is lazily cleaned up on next use of the child, then why is it crashing with out of memory?) And finally, why is so much memory taken up in the main process even after those PDF document instances have been disposed?

I will zip up and send the 40MB HTML file through the usual support submission route.
eo_support
Posted: Tuesday, December 9, 2014 6:58:41 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,229
Hi,

With a 40MB HTML file, you would really be pushing the limits of the product.

Because .NET uses automatic memory management, memory usage is sometimes unpredictable. First, the memory you see in Task Manager may not be accurate. Second, the code does not have precise control over when a memory block is released. Third, and the one most likely to affect your case, is memory fragmentation. EO.Pdf frequently uses large chunks of memory. What can happen after the first conversion is this: even if the second conversion uses exactly the same amount of memory as the first, and the total amount of available memory is the same (after lazy release), the allocation may still fail because there is no single contiguous chunk of memory large enough to satisfy it, whereas this is not a problem for the first conversion.
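The fragmentation argument above can be illustrated with a toy model. This is only a sketch: the hole layout, the sizes in MB, and the helper names are made up for illustration and do not reflect how EO.Pdf, WebKit, or the .NET runtime actually manage memory.

```python
# Toy model of an address space: each free region ("hole") is a (start, size) pair in MB.
# All numbers and names here are hypothetical, chosen only to show the effect.

def total_free(holes):
    """Total free memory, regardless of where the free regions sit."""
    return sum(size for _, size in holes)

def largest_contiguous(holes):
    """Size of the largest single free region."""
    return max((size for _, size in holes), default=0)

def can_allocate(holes, request):
    """A single allocation must fit entirely inside one free region."""
    return largest_contiguous(holes) >= request

# Before the first conversion: one free region of 1800 MB.
fresh = [(0, 1800)]

# After the first conversion: still 1800 MB free in total, but a few small
# surviving allocations have split it into six separate 300 MB holes.
fragmented = [(i * 310, 300) for i in range(6)]

request = 400  # hypothetical size of one large buffer, in MB

print(total_free(fresh), can_allocate(fresh, request))            # 1800 True
print(total_free(fragmented), can_allocate(fragmented, request))  # 1800 False
```

Both layouts have the same amount of free memory, yet only the unfragmented one can satisfy the 400 MB request.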

The best option for you would be to split the HTML file into multiple smaller HTML files, convert each file separately, and then use Document.Merge to merge them together. If you want to try to free some memory, you can call ConvertHtml(string.Empty, new PdfDocument()) after every real conversion. That will clean up some of the lazily released memory. But for your file size you probably still want to split it up.

Thanks!
CWoods
Posted: Tuesday, December 9, 2014 10:30:56 PM
We can't really split the HTMLs up for various reasons which we've covered in the past. The file itself converts as is and we've had success converting files much larger than this (up to 130MB).

The problem is that the conversion leaves a child process lying around holding 1.4 - 1.8GB, and then the next file that comes through that child may or may not fail (but if you independently retry the failed file, it converts w/out issue). It's a bit misleading and hard to reproduce if you don't end up reprocessing just the right set of files in just the right order. I just ran the same filing through multiple times because it made it terribly easy to reproduce the issue... once I had figured out what was going on.

Regarding the memory reported: I'm looking at the Virtual Size, Private Bytes, and Working Set reported by ProcessExplorer for a test app + the child(ren) rundll32.exe processes spawned by EO.Pdf. I'd say that's fairly accurate. Now that could be memory tied up waiting for final garbage collection, or maybe it's objects in some cached memory pools that are not technically in use by the code but are held by some low-level memory manager. Even if it's due to fragmentation, it would have to be a ridiculous amount of fragmentation to tie up that memory - at a 64KB system allocation granularity (16,384 regions per GB) we're talking at least one live allocation in each of roughly 29,500 regions to block up 1.8GB of memory. If there are still that many allocations left around, then I'd say there are some issues. Even 3 minutes of idle time where the child hasn't been used sees no reduction in the memory (which suggests it's not stuff sitting in garbage collection), and even if it was memory waiting for GC, the next conversion request's memory requests should force the GC to make space available.
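The region count in this argument can be checked with a few lines of arithmetic. This is a sketch that assumes the Windows allocation granularity of 64 KiB and reads "1.8GB" in binary units (1 GB = 1024^3 bytes); the function name is made up for illustration.

```python
GRANULARITY = 64 * 1024   # assumed allocation granularity in bytes (64 KiB)
GIB = 1024 ** 3           # assumption: "GB" in the post means 1024^3 bytes

def regions_needed(total_bytes, granularity=GRANULARITY):
    """Number of granularity-sized regions needed to cover total_bytes.

    For fragmentation alone to pin this much memory, each of these regions
    would need at least one live allocation left in it.
    """
    return -(-total_bytes // granularity)  # ceiling division

print(regions_needed(1 * GIB))         # 16384
print(regions_needed(18 * GIB // 10))  # 29492
```

So 16,384 is the number of 64 KiB regions in a single GB; pinning the full 1.8GB would take roughly 29,500 of them.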

As to precise control of when a memory block is released: if we're talking about .NET Framework code, yeah, that's a bit of a trick, though you can force garbage collections (at the cost of performance) or you could add a check on the amount of memory used and then tear down the child if it's over an unusually high threshold (which should force garbage collections). If we're talking about native code (which is what I thought was being loaded into the rundll32.exe instances), then you should in fact have precise control unless it's down in WebKit - which is obviously outside of your control. If it's some data structures that have to be held in memory after conversion while the EO.Pdf functionality in the host process copies them back into the PdfDocument class, then you should know when that is complete and should be able to add a call back to the child to flush them.

Regarding memory fragmentation, it would take an unbelievable amount of memory fragmentation after the code has completed conversion to tie up 1.8GB of memory (see above). Is it possible? Sure. Is it likely? No. Is it something you can test for? Yes. Is it something that I can test for? No. My bet is that while there may be some fragmentation, there are likely one or more data structures that simply aren't cleared until the next document is processed, and in this particular case there's just not enough memory to copy in the new document and start processing before the code gets to clearing out the old structures.

Regarding the suggestion to do an empty convert: if this worked, then I would think it's a suitable workaround at least. The problem, though, is that there could be multiple child instances spawned by EO.Pdf due to concurrent conversion requests - let's say there are 6 instances. Is there something in the code that's going to guarantee that requests on the same thread go to the same child instance? If so, great - we at least have a reasonable workaround. If not, then there's only a <17% chance that the empty conversion request uses the same child instance as the actual conversion request that was just completed - and we're no better off.
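The "<17%" figure follows from a simple model. This is a sketch that assumes each request is routed to one of the child instances uniformly at random and independently of earlier requests; the thread does not say how EO.Pdf actually picks a child, and the function names are made up for illustration.

```python
from fractions import Fraction

def hit_probability(children):
    """Chance that one empty conversion lands on one specific child,
    assuming uniform random routing (an assumption, not EO.Pdf's documented behavior)."""
    return Fraction(1, children)

def at_least_one_hit(children, attempts):
    """Chance that at least one of several independent empty conversions
    reaches that specific child, under the same assumption."""
    miss = 1 - hit_probability(children)
    return 1 - miss ** attempts

print(float(hit_probability(6)))       # about 0.167, i.e. just under 17%
print(float(at_least_one_hit(6, 6)))   # about 0.665
```

Under this model, even six empty conversions in a row would reach the bloated child only about two times out of three.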

eo_support
Posted: Tuesday, December 9, 2014 11:10:34 PM
Hi,

You will have to split your HTML files. Memory optimization is something that we can do maybe once or twice, but we can't do it again and again since we can only squeeze out so much. Additionally, it is very possible that we will add some features that consume more memory, thus reducing the size of file that can be processed. So it won't be a valid argument for you that if it has run this file before then it should process this file now. This is just like arguing that since Windows 3.1 can run on a machine with 2MB of memory, Windows 8 should be able to run on a machine with 2MB of memory as well. On top of that, rundll32.exe hosts a mix of managed code (our code) and a very complex unmanaged code project (WebKit), which makes memory optimization much more difficult.

The remaining rundll32.exe is by design, not a bug. By design the process will sit there waiting for additional work (similar to IIS's worker process, but it will eventually exit if you let it sit long enough). The fact that you are seeing 1.4GB to 1.8GB being used by the process itself is a warning sign: you should never allow it to go that high in a production environment. Presumably these are some "left overs" from the previous conversion that will be released when another conversion task comes into this worker process. It is during this process that you will see heavy fragmentation, which will cause the allocation of big chunks to fail at some point. We've seen that very often when we test/optimize our application on a very modest machine in our environment (the machine has a very modest configuration so that we can see memory problems easily).

You are correct that ConvertHtml(string.Empty) may not hit the rundll32.exe that's holding the memory. Using ConvertHtml(string.Empty) is only valid if you have a single conversion thread. If you have multiple threads and you wish to clean up the worker processes, then you will need to run the conversion in a separate AppDomain and then unload your AppDomain completely. This will kill off all worker processes. That is the only way to thoroughly clean up everything.

We will be very happy to work with you if you need any insight into our internal implementation to help you make better use of our features and obtain optimal output. However, you must understand that it is not practical to expect us to do some magic optimization on our end to fit whatever file size you may have. A 40MB HTML file is an extremely large file that is way above the average. We might be able to spend a huge amount of effort on this to improve it a little bit and squeeze this file through for you but then you will have an even bigger file that will break it. So we won't go down that road. This means the only reliable approach for you is to split such huge files.

Thanks!
CWoods
Posted: Wednesday, December 10, 2014 12:24:52 AM
I do understand the child process design. We covered it back this summer when I ran into the first "zombie" issue (where the children never went away and were never used again). It's expensive to set up, load, and then tear down that environment, so you're keeping a reasonable number of instances around to minimize that performance penalty. As part of the stuff we went through regarding the "zombies" this summer, I did try each request in its own AppDomain (which led to the discovery of the GDI leak - which you fixed) and I know just how much that kills performance in terms of number of conversions per minute. No issues with that there.

We understand that you can't/won't dive into the WebKit code to try and address issues/memory footprint. No issues with that either.

"So it won't be a valid argument for you that if it has run this file before then it should process this file now. This is just like arguing that since Windows 3.1 can run on a machine with 2MB of memory, Windows 8 should be able to run on a machine with 2MB of memory as well."

No, that's not the same thing at all. I understand that 40MB is way outside the normal range. I don't disagree with you personally - it's not what HTML was meant for. We're not expecting any "magical optimization" to fit a 30 ton truck into a standard sized mailbox. If an oversized HTML file fails to convert because it's too complex for the 32-bit process space to hold, then we accept that (and we're waiting for the Blink-based version).

But that is not what we're actually talking about here.

The 40MB file converts. If it didn't, okay, but it does. What comes after it may or may not convert - it might be only a 2MB HTML file. What we're talking about here is leftover state/data from the prior conversion request that doesn't leave enough room for the new conversion to reach the point where it discards those "left overs" before it crashes.

"We might be able to spend a huge amount of effort on this to improve it a little bit and squeeze this file through for you but then you will have an even bigger file that will break it. So we won't go down that road."

What we're asking is that you check to:

a) Verify that there's not actually a memory leak while processing this file that is causing this situation.
b) See if there's any reasonable way to move the cleanup of any "left overs" from the previous conversion to the very start of the next request (before the HTML is passed to it) or better yet - after you're completely done with the current request (so that the memory isn't being held by the process).

Can we at least go that far down the road?

Maybe it's WebKit holding all the memory until the next request and it's as simple as discarding the WebKit instance and creating another.
Maybe there's a few data structures that were left dangling in your managed code that could/should have been set to null.
Maybe it's a leak in WebKit and there's some simple check that you can put in in the parent/child code that checks the size and automatically recycles the child.
Maybe it is a little more complex and involves some minor refactoring or some additional simple communication between the parent and child functionality so that you can explicitly call some memory cleanup functionality.
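The "check the size and automatically recycle the child" idea can be sketched roughly as follows. Everything here is hypothetical: Worker and RecyclingHost are made-up stand-ins, the memory growth is simulated, and nothing below is EO.Pdf's API or internal design.

```python
class Worker:
    """Stand-in for a long-lived child process (hypothetical, not EO.Pdf's API)."""
    _next_id = 0

    def __init__(self):
        Worker._next_id += 1
        self.id = Worker._next_id
        self.memory_mb = 50  # simulated baseline footprint of a fresh worker

    def convert(self, html_size_mb):
        # Simulate leftovers: each conversion leaves memory behind in
        # proportion to the input size (an invented factor of 40).
        self.memory_mb += html_size_mb * 40
        return f"pdf-from-worker-{self.id}"


class RecyclingHost:
    """Replaces the worker once its footprint crosses a threshold."""

    def __init__(self, threshold_mb=1000):
        self.threshold_mb = threshold_mb
        self.worker = Worker()
        self.recycled = 0

    def convert(self, html_size_mb):
        result = self.worker.convert(html_size_mb)
        # Check after the request completes, so the next request starts clean.
        if self.worker.memory_mb > self.threshold_mb:
            self.worker = Worker()  # a real host would terminate and respawn the child
            self.recycled += 1
        return result


host = RecyclingHost(threshold_mb=1000)
host.convert(40)  # 50 + 1600 = 1650 MB, over the threshold, so the worker is replaced
host.convert(2)   # runs on the fresh worker: 50 + 80 = 130 MB
print(host.recycled, host.worker.memory_mb)  # 1 130
```

The check runs after the result has been handed back, so the cost of respawning is paid between requests rather than in the middle of the next conversion.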

