Generating a Large PDF Document
Bob Pinella
Posted: Tuesday, February 24, 2015 6:25:51 PM
Rank: Member
Groups: Member

Joined: 10/14/2014
Posts: 17
Hello,

I have a process that creates a rather large PDF (around 7,000 pages). The process retrieves a list of customers, loops through them, and creates 2 PDF pages per customer. Each PDF is then added to a master PDF document. We have 2 ASP.NET web pages that make database calls to generate the data that goes into the master document. In my loop, I call HtmlToPdf.ConvertUrl twice, once for each page, passing query string values to generate the appropriate data.

We started receiving System.OutOfMemoryException errors, so I looked deeper into this process and realized that the data for the first web page is exactly the same for every customer. I would therefore like to store the result of the first call to HtmlToPdf.ConvertUrl outside the loop and simply add it for each customer inside the loop, so that this URL is only requested once. How can I add the HtmlToPdfResult object to the master PDF inside the loop? I have used PdfDocument.Merge before, but it only accepts an array of PdfDocument objects or an array of HtmlToPdfResult objects, not a mix of both. Let me know at your earliest convenience. Thank you for your time.

Bob Pinella
eo_support
Posted: Tuesday, February 24, 2015 6:48:29 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,196
Hi,

You might want to look into the "PDF Creator" interface instead of the HTML to PDF interface. The PDF Creator interface is much lighter-weight (but also much less powerful). It might help you if your HTML is not overly complicated.

To answer your original question, there is no way to store the HtmlToPdfResult object and repeat it elsewhere. However, it is very easy to append one PdfDocument object to another. So if the data for each customer is in full pages, then yes, you can simply merge the first conversion's result PdfDocument into another PdfDocument by calling PdfDocument.Merge. You will need to save the files to disk and then call the Merge overload that takes file names (the overload that takes PdfDocument objects does a deep merge and consumes more memory). However, this still may not resolve the issue for you, because 7,000 pages is a lot. Note that PdfDocument.Merge only merges full pages. For example, there is no way to append a few lines of text to an existing page.

Thanks!
Bob Pinella
Posted: Wednesday, February 25, 2015 9:27:28 AM
Hello,
Thank you for the quick reply. I was able to re-use the first page by doing the following:

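// Convert the shared first page once, outside the customer loop.
HtmlToPdfResult result = HtmlToPdf.ConvertUrl(page1Url, masterPDF, options);
// Keep a copy of that page so it can be re-used for every later customer.
PdfDocument page1 = result.PdfDocument.Clone(0, 1);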

Then inside my loop, for all customers after the first one, I called Merge:

pdf = PdfDocument.Merge(masterPDF, page1);

Worked like a charm! Thank you for the amazing support and product!

Bob
eo_support
Posted: Wednesday, February 25, 2015 10:18:55 AM
Great! Glad to hear that it works for you. Please feel free to let us know if you have any more questions.

Thanks!
Bob Pinella
Posted: Monday, March 2, 2015 10:21:46 AM
Also, for anyone else creating large PDFs: it is much more efficient to build several smaller PDFs and merge them all at the end. We were getting an OutOfMemory exception when trying to add a PDF to an existing document that contained over 6,000 pages. As the document grew, adding a page kept getting slower until the exception was thrown. I got around this by limiting each PDF to 500 pages. Once I hit 500, I created a new PDF document and added subsequent pages to that one. At the end, I called Merge on about 13 documents of 500 pages each. We had no problems with this method, and it brought the PDF creation time down from over 8 hours to about 52 minutes!
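
In case it helps anyone, here is a rough sketch of just the batching arithmetic, kept separate from any PDF calls. BatchPlanner and Plan are hypothetical names for illustration and are not part of EO.Pdf. It assumes every customer produces the same number of pages and that a customer's pages should never be split across two batch files.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of EO.Pdf): decides which customers go into
// which intermediate PDF so that no batch exceeds a page limit.
public static class BatchPlanner
{
    // Returns one (Start, Count) pair per batch, where Start is the index of
    // the first customer in the batch and Count is the number of customers.
    public static List<(int Start, int Count)> Plan(
        int customerCount, int pagesPerCustomer, int maxPagesPerBatch)
    {
        if (customerCount < 0)
            throw new ArgumentOutOfRangeException(nameof(customerCount));
        if (pagesPerCustomer <= 0)
            throw new ArgumentOutOfRangeException(nameof(pagesPerCustomer));
        if (maxPagesPerBatch < pagesPerCustomer)
            throw new ArgumentOutOfRangeException(nameof(maxPagesPerBatch));

        // Whole customers only, so one customer's pages stay in one file.
        int customersPerBatch = maxPagesPerBatch / pagesPerCustomer;

        var batches = new List<(int Start, int Count)>();
        for (int start = 0; start < customerCount; start += customersPerBatch)
        {
            // The last batch may be smaller than the others.
            int count = Math.Min(customersPerBatch, customerCount - start);
            batches.Add((start, count));
        }
        return batches;
    }
}
```

With 2 pages per customer and a 500-page limit, each batch holds 250 customers. For each batch you would start a fresh PdfDocument, run the conversions for that range of customers into it, and then merge the batch documents once at the end, as described above.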

Bob
eo_support
Posted: Monday, March 2, 2015 8:18:32 PM
Great observations. Thank you very much for sharing!

