
EO.Pdf HTML to PDF conversion takes a long time
An
Posted: Monday, June 15, 2020 5:59:40 PM
Rank: Newbie
Groups: Member

Joined: 3/23/2020
Posts: 5
I upgraded to the newest version of EO.Pdf (latest DLLs, 20.1.45.0, with a new key).

EO.Pdf HTML to PDF conversion sometimes takes longer than expected, or it times out and produces blank content.
I have set MinLoadWaitTime to 10 seconds and MaxLoadWaitTime to 35 seconds.

Here is the sample code:
var options = new HtmlToPdfOptions
{
    MinLoadWaitTime = 10000,
    MaxLoadWaitTime = 35000,
    AutoFitX = HtmlToPdfAutoFitMode.ScaleToFit,
    PageSize = new SizeF(PdfPageSizes.A4.Height, PdfPageSizes.A4.Width),
    OutputArea = new RectangleF(
        0, verticalTopMargin,
        PdfPageSizes.A4.Height,
        PdfPageSizes.A4.Width - verticalTopMargin - verticalBottomMargin),
    FooterHtmlFormat = "<span style='float: right;font-size: 12px;padding: 0;margin: 0 30px 0 0;'>Page {page_number}-{total_pages}</span>"
};
var pdfDocs = Enumerable.Empty<PdfDocument>();
foreach (var url in urls)
{
    var pdfDocument = new PdfDocument { EmbedFont = false };
    HtmlToPdf.ConvertUrl(url, pdfDocument, options);
    pdfDocs = pdfDocs.Concat(pdfDocument.Yield());
}

Your help is appreciated.

Thanks,
An
eo_support
Posted: Tuesday, June 16, 2020 11:46:43 AM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,200
Hi,

The most common cause of conversion delay is the loading stage, not the conversion stage. When you call ConvertUrl, the internal browser engine tries to load that URL and waits for the entire page, as well as dependency resources (JavaScript, images, etc.), to finish loading before the conversion starts. Most of the time the delay occurs while the converter is waiting for those resources to be loaded.

If you have access to the web server, you can check the web server's log to see if anything is holding up there. For example, if your page contains a chart that is dynamically generated from a database, and the code that accesses the database takes a long time, then that will hold up the image and subsequently hold up the converter. You should be able to see this kind of delay in your web server logs.

If you do not have access to the web server, you can use a traffic monitor to watch the traffic between the converter and the web server. If there is any delay in the communication, the traffic monitor should show it as well.

Thanks!
An
Posted: Thursday, July 9, 2020 10:00:14 AM
Rank: Newbie
Groups: Member

Joined: 3/23/2020
Posts: 5
Hi,

After we upgraded to the newest version, page data is no longer rendered in our .NET application, which creates PDFs through URL conversion. The pages contain charts that are dynamically generated from a database. Some pages return within seconds; others take longer, depending on the complexity of the chart or query. Per your suggestion, we monitored the traffic, and nothing is holding up in the web server log.
The suggestion to use a minimum wait time is also not working for us. The target pages range from a single page to many data-heavy pages, so to be safe we would need to impose a minimum wait time of several seconds. The data does not load that slowly, but if we set the wait close to the actual load time, EO.Pdf begins the conversion before the data is ready. Making the rendering of the dynamic portions of the page dependent on the data fetch completing does not work either. The result is blank content for some data-heavy pages, and performance is slow.

Another option we have tried is HtmlToPdfTriggerMode.Auto, which per your definition "Automatically triggers conversion as soon as the page contents (styles, images, etc) are loaded". This option no longer works in the newest version, although it worked in the old version.
Would you please look into HtmlToPdfTriggerMode.Auto in the newest EO.Pdf version?

Any suggestions are appreciated.

Kind regards,
eo_support
Posted: Thursday, July 9, 2020 10:41:27 AM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,200
Hi,

TriggerMode.Auto only checks script/image tags. However, if you dynamically generate a chart with JavaScript, it is not possible for the converter to know exactly when you have finished generating it. Your only two options are to always wait for a long time, or to explicitly "tell" the converter that your contents are ready and the conversion can start. For the second option you can use a manual trigger:

https://www.essentialobjects.com/doc/pdf/htmltopdf/trigger.aspx

The reason the converter can't tell when your JavaScript code has finished generating your chart is that there is no clear definition of "finish" for anything in a web page. For example:

1. If your web page contains a digital clock that keeps ticking forever (updated by JavaScript code), then obviously there is no definition of "finish" in this page;
2. If your page contains a JavaScript generated bar chart, but your code plays an animation so that the bar grows from zero height to full height, then there is no way the converter engine will know when your bar has finished growing;

The only way to capture precisely what you want is for you to explicitly tell the converter when to capture the output. If that is not doable, then the only option is to wait long enough.
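
As a rough sketch of "explicitly telling the converter" when a page has several charts: the helper below counts finished charts and calls eoapi.convert() once, after the last one reports in. The names createConvertTrigger, chartDone and the host parameter are made up for this illustration; only eoapi.isEOPdf() and eoapi.convert() come from the manual trigger documentation linked above. How a chart reports that it has finished depends on your chart library, so treat this as a starting point rather than a drop-in solution.

```javascript
// Hypothetical helper: returns a function that each chart calls once it has
// finished rendering. When the last chart reports in, the converter is
// triggered exactly once. `host` is an assumption that keeps the helper
// testable; inside a page it defaults to `window`.
function createConvertTrigger(chartCount, host) {
  const scope = host || (typeof window !== "undefined" ? window : {});
  let remaining = chartCount;
  let triggered = false;

  return function chartDone() {
    remaining -= 1;
    if (remaining > 0 || triggered) {
      return false; // still waiting for other charts, or already triggered
    }
    triggered = true;
    // A regular browser has no eoapi object, so the page behaves normally
    // outside the converter.
    if (scope.eoapi && scope.eoapi.isEOPdf()) {
      scope.eoapi.convert(); // tell the converter the page is complete
      return true;
    }
    return false;
  };
}
```

On a page with three charts you would create the trigger once with createConvertTrigger(3) and call the returned function from each chart's render-complete callback. Calling convert() from every script file independently risks starting the conversion as soon as the first chart is done.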

Hope this makes sense to you.

Thanks!
An
Posted: Monday, August 10, 2020 11:01:19 AM
Rank: Newbie
Groups: Member

Joined: 3/23/2020
Posts: 5
Thank you so much for the suggestions.
Per your instructions at https://www.essentialobjects.com/doc/pdf/htmltopdf/trigger.aspx,
I applied the condition below to all my JS script files.
if (window.eoapi && eoapi.isEOPdf())
    eoapi.convert();

I also set TriggerMode = Dual. (When TriggerMode is set to Dual, the converter will still wait for page contents to be loaded. The conversion will start only after page contents are loaded and eoapi.convert() is called.)

It worked well from localhost. When I deployed to the server, where a Quartz scheduler runs the decks automatically, some decks return and send email messages with the PDF contents. Other decks get a time-out message:

Message: System.Exception: Time out expired before the page can be loaded.
at EO.Internal.srem.scxu(sqzc btk, String btl, String btm, String btn, Int32 bto, Int32 btp, String btq, Boolean btr)
at EO.Internal.srem.eedt(sqzc btc, String btd, String bte, String btf, Int32 btg, Int32 bth, String bti, Boolean btj)
at EO.Pdf.HtmlToPdfSession.eedt(sqzc yr, String ys, String yt, Int32 yu, Int32 yv, String yw, Boolean yx)
at EO.Pdf.HtmlToPdfSession.eedt(sqzc yy, String yz, String za, Boolean zb)
at EO.Pdf.HtmlToPdfSession.LoadUrl(String url)
at EO.Pdf.HtmlToPdf.pvwn.lhie()
at EO.Internal.srej.escb[a](gwpu`1 bsj)
at EO.Pdf.HtmlToPdf.ConvertUrl(String url, PdfDocument doc, HtmlToPdfOptions options)

Here is my function:
public Stream CreatePdf(IEnumerable<string> urls, IEnumerable<string> additionalHeaders)
{
    const float verticalTopMargin = 0.2f, verticalBottomMargin = 0.5f;
    var options = new HtmlToPdfOptions
    {
        // MinLoadWaitTime = 40000,
        // MaxLoadWaitTime = 60000,
        TriggerMode = HtmlToPdfTriggerMode.Dual,
        AutoFitX = HtmlToPdfAutoFitMode.ScaleToFit,
        PageSize = new SizeF(PdfPageSizes.A4.Height, PdfPageSizes.A4.Width),
        OutputArea = new RectangleF(
            0, verticalTopMargin,
            PdfPageSizes.A4.Height,
            PdfPageSizes.A4.Width - verticalTopMargin - verticalBottomMargin),
        FooterHtmlFormat = "<span style='float: right;font-size: 12px;padding: 0;margin: 0 30px 0 0;'>Page {page_number}-{total_pages}</span>"
    };
    if (additionalHeaders.Any())
    {
        options.AdditionalHeaders = additionalHeaders.ToArray();
    }
    var pdfDocs = Enumerable.Empty<PdfDocument>();
    foreach (var url in urls)
    {
        var pdfDocument = new PdfDocument { EmbedFont = false };
        pdfDocument.Info.Author = "TMC Change Board";
        pdfDocument.Info.CreationDate = DateTimeHelper.GetSeattleNow();
        pdfDocument.Info.Creator = "TMC Change Board";
        try
        {
            HtmlToPdf.ConvertUrl(url, pdfDocument, options);
            pdfDocs = pdfDocs.Concat(pdfDocument.Yield());
        }
        catch (Exception)
        {
            // Rethrow with "throw;" so the original stack trace is preserved.
            throw;
        }
    }
    var pdfMerged = PdfDocument.Merge(pdfDocs.ToArray());
    var stream = new MemoryStream(20480);
    pdfMerged.Save(stream);
    stream.Position = 0;

    return stream;
}


Any suggestions are appreciated.

Thanks,
An
eo_support
Posted: Monday, August 10, 2020 2:18:26 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,200
Hi,

Your code looks fine, and the issue you are seeing does not have anything to do with the trigger. The PDF converter works in three stages:

1. Load your web page. Yours failed in this step;
2. Wait for triggers to be satisfied. The trigger mode, images, JavaScript, etc. affect this step. If this step fails, you will get an "operation times out" error;
3. Convert the page to PDF;

The most common cause of failure in step 1 is a system overload somewhere, either on your web server or on your local machine. In that case the only solution is to slow down, or to ignore the error and retry (which itself often serves as an automatic slowdown of some sort).
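
To illustrate the "ignore the error and retry" idea, here is a minimal sketch of a retry loop with a growing pause between attempts. It is written in JavaScript only to match the trigger snippet in this thread; the same logic would apply around the server-side HtmlToPdf.ConvertUrl call. The names retryWithBackoff, operation and pause are assumptions for this illustration and are not part of the EO.Pdf API.

```javascript
// Hypothetical retry helper. `operation` stands in for the call that wraps
// the conversion; `pause` is an injected blocking wait (for example a thread
// sleep on the server). Neither is part of the EO.Pdf API.
function retryWithBackoff(operation, maxAttempts, baseDelayMs, pause) {
  let lastError = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation(attempt); // success: return the result immediately
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Double the wait after each failure: base, 2x base, 4x base, ...
        pause(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  throw lastError; // every attempt failed; surface the last error
}
```

The growing pause is what provides the slowdown: an overloaded server gets progressively more time to recover before the next attempt. If the conversion still fails after the last attempt, the error is rethrown so that a persistent failure is not hidden.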

If you get this problem when the traffic is very low, or the converter can't recover afterwards (for example, once it has failed it keeps failing no matter how many times you retry), that would indicate there might be a problem on our side. In that case you can try updating to the latest build (we just updated the browser engine in 20.1.88 and then fixed a number of other issues later) and creating a test project that demonstrates the problem. We can then use the test project to try to reproduce the problem in our environment and see what we can find. See here for more information on test projects:

https://www.essentialobjects.com/forum/test_project.aspx

Thanks!

