|
Rank: Member Groups: Member
Joined: 10/14/2014 Posts: 17
|
Hello,
I have a process that creates a rather large PDF (around 7000 pages). This process retrieves a list of customers, loops through them, and creates 2 PDF pages per customer. Each PDF is then added to a master PDF document. We have 2 ASP.NET web pages that query the database to generate the data that needs to be included in the master document. In my loop, I call HtmlToPdf.ConvertUrl twice, once for each page, passing query string values to generate the appropriate data.
We started receiving System.OutOfMemoryException errors, so I looked deeper into this process and realized that the data for the first web page is exactly the same for every customer. Therefore I want to store the result of the first call to HtmlToPdf.ConvertUrl outside the loop and just add it for each customer inside the loop. That way, I would only call this URL once. How can I add the HtmlToPdfResult object to the master PDF inside the loop? I have used PdfDocument.Merge before, but it only accepts an array of PdfDocument objects or an array of HtmlToPdfResult objects, not a mix of both. Let me know at your earliest convenience. Thank you for your time.
Bob Pinella
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,196
|
Hi,
You might want to look into the "PDF Creator" interface instead of the HTML to PDF interface. The PDF Creator interface is much more lightweight (but also much less powerful), so it might help if your HTML is not overly complicated.
To answer your original question, there is no way to store the HtmlToPdfResult object and repeat it elsewhere. However, it is very easy to append one PdfDocument object to another. So if the data for each customer is in full pages, then yes, you can simply append the PdfDocument produced by the first conversion to another PdfDocument by calling PdfDocument.Merge. You will need to actually save the files to disk and then call the Merge overload that takes file names (the one that takes PdfDocument objects does a deep merge and consumes more memory).
However, this still may not resolve the issue for you because 7000 is a big number. Note that PdfDocument.Merge only merges full pages. For example, there is no way to append a few lines of text to an existing page.
Thanks!
|
|
Rank: Member Groups: Member
Joined: 10/14/2014 Posts: 17
|
Hello, thank you for the quick reply. I was able to reuse the first page by doing the following:
HtmlToPdfResult result = HtmlToPdf.ConvertUrl(page1Url, masterPDF, options);
PdfDocument page1 = result.PdfDocument.Clone(0, 1);
Then inside my loop, for all customers after the first one, I called Merge:
pdf = PdfDocument.Merge(masterPDF, page1);
Worked like a charm! Thank you for the amazing support and product!
Bob
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,196
|
Great! Glad to hear that it works for you. Please feel free to let us know if you have any more questions.
Thanks!
|
|
Rank: Member Groups: Member
Joined: 10/14/2014 Posts: 17
|
Also, for anyone else who is creating large PDFs: it is much more efficient to break the output up into smaller PDFs and merge them all at the end. We were getting an OutOfMemory exception when trying to add a PDF to an existing document that contained over 6000 pages. As the document grew, adding a page kept getting slower and slower until the exception was thrown. I got around this by limiting each PDF to 500 pages. Once I hit 500, I created a new PDF document and added the next PDF to that new document instead. At the end, I called Merge on about 13 documents of 500 pages each. We experienced no problems with this method, and it brought the PDF creation time down from over 8 hours to about 52 minutes!
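For readers who want to try the same approach, here is a minimal, library-agnostic sketch of the batching step only. It is not the code used in the process described above: the names Batching, InBatches, customerIds and partCount are hypothetical, the customer count is made up for illustration, and the actual page rendering and final merge are left as comments because they depend on your PDF library.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Yields consecutive batches of at most batchSize items from source.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Emit the final, partially filled batch (if any).
        if (batch.Count > 0)
            yield return batch;
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical numbers: 3250 customers at 2 pages each,
        // capped at 500 pages (250 customers) per part document.
        var customerIds = new List<int>();
        for (int id = 1; id <= 3250; id++) customerIds.Add(id);

        int partCount = 0;
        foreach (List<int> batch in Batching.InBatches(customerIds, 250))
        {
            // Here each batch would be rendered into its own new PDF document,
            // so no single document grows past 500 pages.
            partCount++;
        }

        // The part documents would then be merged once at the end.
        Console.WriteLine(partCount + " part documents"); // prints "13 part documents"
    }
}
```

The point of batching by customer rather than by page is that a customer's 2 pages never get split across two part documents, and the expensive append operation always runs against a small document.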
Bob
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,196
|
Great observations. Thank you very much for sharing!
|
|