
HtmlToPdf.ConvertHtml Error Options
Carl De
Posted: Thursday, September 12, 2019 1:18:00 PM
Rank: Newbie
Groups: Member

Joined: 2/18/2019
Posts: 8
Hello,

I have been performing numerous load tests of our new pdf generation application. We were experiencing frequent thread blocking errors that were locking up our servers. This was using version 17.

We have since upgraded to version 19 and it has been a major improvement.

Some background:

To validate the improvements I created a simple console app that loaded several hundred previously validated HTML files. I then made multi-threaded calls to ConvertHtml in order to find the most reliable way to call the conversion. I wrapped the conversion calls in a SemaphoreSlim block, with MaxConcurrentTaskCount set marginally higher than the semaphore limit and MaxLoadWaitTime = 60000 (60s).

I compared the following 5 methods within a large number of threaded tasks, and tested each at various slim/MaxConcurrentTaskCount amounts to find the most efficient and reliable method.

Code: C#
1)  HtmlToPdf.ConvertHtml(html, doc, options);

2)  HtmlToPdf.ConvertHtml(html, doc, options);
    HtmlToPdf.ClearResult();

3)  var t1 = Task.Run(() =>
    {
        return HtmlToPdf.ConvertHtml(html, doc, options);
    });
    var result = t1.Result;

4)  using (var session = HtmlToPdfSession.Create(options))
    {
        session.LoadHtml(html);
        session.RenderAsPDF(doc);
    }

5)  var t2 = Task.Run(() =>
    {
        using (var session = HtmlToPdfSession.Create(options))
        {
            session.LoadHtml(html);
            session.RenderAsPDF(doc);
        }
        return true;
    });
    var result2 = t2.Result;

I also ran each test twice: once with the calls in a static class and once in a non-static class.

If you are interested in my findings, option 3 was by far the most consistently reliable and performant. The worst was far and away option 2. It threw many errors and did not handle the load at all.
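For readers who want to reproduce this kind of test, the throttling arrangement described above can be sketched roughly as follows. This is only an illustration, not the actual test harness: ThrottledLoadTest, RunAsync and the convert delegate are hypothetical names, and convert stands in for whichever of the five methods is under test (for example a lambda wrapping HtmlToPdf.ConvertHtml).

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledLoadTest
{
    // Runs `convert` once per input with at most `limit` calls in flight,
    // and returns how many calls threw an exception.
    // `convert` is a placeholder for the real conversion; it is not an EO.Pdf API.
    public static async Task<int> RunAsync(string[] inputs, int limit, Action<string> convert)
    {
        using (var gate = new SemaphoreSlim(limit, limit))
        {
            int failures = 0;
            var tasks = inputs.Select(async html =>
            {
                await gate.WaitAsync();   // wait for a free slot
                try
                {
                    // Move the blocking conversion onto a thread-pool thread.
                    await Task.Run(() => convert(html));
                }
                catch (Exception)
                {
                    Interlocked.Increment(ref failures);
                }
                finally
                {
                    gate.Release();       // free the slot even on failure
                }
            }).ToArray();

            await Task.WhenAll(tasks);
            return failures;
        }
    }
}
```

Counting failures rather than stopping at the first one makes it easier to estimate a failure rate such as the one discussed below.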

Current Error:

For the most part my load tests have been successful, but I have been seeing an error in roughly 1 out of every 1,000 conversions. Please note that in my load tests I am loading about 1,000 data files and sending each of them through multiple times. The same file sometimes passes and occasionally fails, so it's likely not the data.

Here are the relevant sections of code:

Code: C#
private static SemaphoreSlim _slim = new SemaphoreSlim(40, 40);

HtmlToPdf.MaxConcurrentTaskCount = 85;

var doc = new PdfDocument();
var options = new HtmlToPdfOptions()
{
    OutputArea = new RectangleF(0.5f, 0.5f, 7.5f, 10f),
    PageSize = new SizeF(8.5f, 11f),
    UserStyleSheet = _styleSheets.CdnCss,
    StartPageIndex = 0,
    MaxLoadWaitTime = 60000
};

// These documents require the header section (variable size) to appear on every page.
// I generate the header (a complete HTML document) to determine its size, so I can adjust the output area accordingly.
if (headerOnEachPage)
{
    var head = new PdfDocument();
    var task1 = Task.Run(() =>
    {
        return HtmlToPdf.ConvertHtml(htmlTemplates.InvoiceHeader, head, options);
    });
    var result1 = task1.Result;
    options.OutputArea = new RectangleF(0.5f, result1.LastPosition + 0.5f, 7.5f, 10f - result1.LastPosition);
    options.HeaderHtmlFormat = htmlTemplates.InvoiceHeader;
    options.HeaderHtmlPosition = 0.5f;
}

// The other types have the header section built into the body, so it only appears on the first page.
var task2 = Task.Run(() =>
{
    return HtmlToPdf.ConvertHtml(htmlTemplates.BodyTemplate, doc, options);
});
var result2 = task2.Result;

This all works fine except for the 1 in 1,000 or so conversions that fail.


I have logged the stack trace for those errors. It certainly appears to be internal to EO. My hope is that you can help me decipher it, and ideally suggest a solution, or potential reasons for the error.

Thank you in advance.

This appears to be the main error:

"TypeName": "System.NullReferenceException",
"Message": "Object reference not set to an instance of an object.",
"Detail": "System.NullReferenceException: Object reference not set to an instance of an object.
EO.Internal.ob.f() at offset 107
EO.Internal.ob..ctor(EO.Internal.at A_0,EO.Pdf.HtmlToPdfOptions A_1) at offset 137
EO.Pdf.HtmlToPdfSession.a(EO.Pdf.HtmlToPdfOptions A_0) at offset 458
EO.Pdf.HtmlToPdfSession..ctor(EO.Pdf.HtmlToPdfOptions A_0,EO.Pdf.HtmlToPdfSession A_1) at offset 145
EO.Pdf.HtmlToPdfSession.Create(EO.Pdf.HtmlToPdfOptions options) at offset 41
EO.Pdf.HtmlToPdf.ConvertHtml(System.String html,EO.Pdf.PdfDocument doc,EO.Pdf.HtmlToPdfOptions options) at offset 46
System.Threading.Tasks.Task`1.InnerInvoke() at offset 79
System.Threading.Tasks.Task.Execute() at offset 73
System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext,System.Threading.ContextCallback callback,System.Object state,Boolean preserveSyncCtx) at offset 357
System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext,System.Threading.ContextCallback callback,System.Object state,Boolean preserveSyncCtx) at offset 20
System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task& currentTaskSlot) at offset 773
System.Threading.Tasks.Task.ExecuteEntry(Boolean bPreventDoubleExecution) at offset 146
System.Threading.Tasks.ThreadPoolTaskScheduler.TryExecuteTaskInline(System.Threading.Tasks.Task task,Boolean taskWasPreviouslyQueued) at offset 70
System.Threading.Tasks.TaskScheduler.TryRunInline(System.Threading.Tasks.Task task,Boolean taskWasPreviouslyQueued) at offset 169
System.Threading.Tasks.Task.WrappedTryRunInline() at offset 58
System.Threading.Tasks.Task.InternalWait(Int32 millisecondsTimeout,System.Threading.CancellationToken cancellationToken) at offset 366
System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification) at offset 39
CHR.Financials.DocDelPDFGeneration.API.Repositories.ConvertToPdfRepository.CreatePdfFromHtml(CHR.Financials.DocDelPDFGeneration.API.Models.HtmlTemplatesModel htmlTemplates,CHR.Financials.DocDelPDFGeneration.API.Models.InvoiceSettingsModel settings) at offset 946
CHR.Financials.DocDelPDFGeneration.API.Managers.DocDelPdfGenerationManager.CreateInvoicePdf(CHR.Financials.Invoicing.Domain.Models.Invoices.EisInvoices invoiceDataModel,CHR.Financials.CustomerRules.Api.ServiceModel.DTOs.InvoiceAppearanceSettings appearanceSettings,Boolean writeTestOutput) at offset 591
ServiceStack.Host.ServiceRunner`1.Execute(ServiceStack.Web.IRequest req,System.Object instance,TRequest requestDto) at offset 520
",
"ExceptionId": "System.NullReferenceException at f() in EO.Internal.ob",
"Source": "EO.Pdf",
"TargetSite": "EO.Pdf.dll!EO.Internal.ob.f()",
"HResult": -2147467261,
"IsDuplicate": false,
"CallingAssembly": "EO.Pdf, Version=19.2.42.0, Culture=neutral, PublicKeyToken=e92353a6bf73fffc",

This also comes with a matching AggregateException ("One or more errors occurred"), which I catch, add the location to the message, and rethrow with:

Code: C#
catch (AggregateException ae)
{
    var ex = ae.Flatten();
    throw new InvalidOperationException("AggregateException - " + ex.Message, ex.InnerException);
}


"TypeName": "System.InvalidOperationException",
"Message": "AggregateException - One or more errors occurred.",
"Detail": "System.InvalidOperationException: AggregateException - One or more errors occurred. ---> System.NullReferenceException: Object reference not set to an instance of an object.
EO.Internal.ob.f() at offset 106
EO.Internal.ob..ctor(EO.Internal.at A_0,EO.Pdf.HtmlToPdfOptions A_1) at offset 136
EO.Pdf.HtmlToPdfSession.a(EO.Pdf.HtmlToPdfOptions A_0) at offset 457
EO.Pdf.HtmlToPdfSession..ctor(EO.Pdf.HtmlToPdfOptions A_0,EO.Pdf.HtmlToPdfSession A_1) at offset 144
EO.Pdf.HtmlToPdfSession.Create(EO.Pdf.HtmlToPdfOptions options) at offset 40
EO.Pdf.HtmlToPdf.ConvertHtml(System.String html,EO.Pdf.PdfDocument doc,EO.Pdf.HtmlToPdfOptions options) at offset 45
System.Threading.Tasks.Task`1.InnerInvoke() at offset 78
System.Threading.Tasks.Task.Execute() at offset 72
CHR.Financials.DocDelPDFGeneration.API.Repositories.ConvertToPdfRepository.CreatePdfFromHtml(CHR.Financials.DocDelPDFGeneration.API.Models.HtmlTemplatesModel htmlTemplates,CHR.Financials.DocDelPDFGeneration.API.Models.InvoiceSettingsModel settings) at offset 946
CHR.Financials.DocDelPDFGeneration.API.Managers.DocDelPdfGenerationManager.CreateInvoicePdf(CHR.Financials.Invoicing.Domain.Models.Invoices.EisInvoices invoiceDataModel,CHR.Financials.CustomerRules.Api.ServiceModel.DTOs.InvoiceAppearanceSettings appearanceSettings,Boolean writeTestOutput) at offset 591
ServiceStack.Host.ServiceRunner`1.Execute(ServiceStack.Web.IRequest req,System.Object instance,TRequest requestDto) at offset 520
ServiceStack.Host.ServiceExec`1.Execute(ServiceStack.Web.IRequest request,System.Object instance,System.Object requestDto,System.String requestName) at offset 393
ServiceStack.Host.ServiceRequestExec`2.Execute(ServiceStack.Web.IRequest requestContext,System.Object instance,System.Object request) at offset 119
ServiceStack.Host.ServiceController.ManagedServiceExec(ServiceStack.Host.ServiceExecFn serviceExec,ServiceStack.IService service,ServiceStack.Web.IRequest request,System.Object requestDto) at offset 251
ServiceStack.Host.ServiceController+<>c__DisplayClass36_0.<RegisterServiceExecutor>b__0(ServiceStack.Web.IRequest req,System.Object dto) at offset 179
ServiceStack.Host.ServiceController.Execute(System.Object requestDto,ServiceStack.Web.IRequest req) at offset 177
ServiceStack.HostContext.ExecuteService(System.Object request,ServiceStack.Web.IRequest httpReq) at offset 94
ServiceStack.Host.Handlers.GenericHandler+<>c__DisplayClass14_1.<ProcessRequestAsync>b__0(System.Threading.Tasks.Task t) at offset 170
ServiceStack.AsyncExtensions.Continue(System.Threading.Tasks.Task task,System.Func`2[System.Threading.Tasks.Task,TOut] next) at offset 111
ServiceStack.Host.Handlers.GenericHandler.ProcessRequestAsync(ServiceStack.Web.IRequest httpReq,ServiceStack.Web.IResponse httpRes,System.String operationName) at offset 649
ServiceStack.Host.Handlers.HttpAsyncTaskHandler.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext context,System.AsyncCallback cb,System.Object extraData) at offset 458
System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() at offset 389
System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep step) at offset 224
System.Web.HttpApplication.ExecuteStep(IExecutionStep step,Boolean& completedSynchronously) at offset 131
System.Web.HttpApplication+PipelineStepManager.ResumeSteps(System.Exception error) at offset 1251
System.Web.HttpApplication.BeginProcessRequestNotification(System.Web.HttpContext context,System.AsyncCallback cb) at offset 112
System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest wr,System.Web.HttpContext context) at offset 375
System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr rootedObjectsPointer,IntPtr nativeRequestContext,IntPtr moduleData,Int32 flags) at offset 864
",
"ExceptionId": "System.InvalidOperationException at CreatePdfFromHtml(HtmlTemplatesModel htmlTemplates,InvoiceSettingsModel settings) in CHR.Financials.DocDelPDFGeneration.API.Repositories.ConvertToPdfRepository",
"Source": "Financials.DocDelPDFGeneration.API.Repositories",
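As background on why the AggregateException appears at all: reading Task.Result on a faulted task wraps the original exception in an AggregateException, whereas GetAwaiter().GetResult() rethrows the original exception directly. The sketch below illustrates the difference using a simulated failure; TaskErrorDemo and its members are hypothetical names and nothing here calls EO.Pdf.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskErrorDemo
{
    // Simulates a conversion that fails the way the logged error does.
    private static int Fail()
    {
        throw new NullReferenceException("simulated conversion failure");
    }

    public static Task<int> StartFailingTask()
    {
        return Task.Run(() => Fail());
    }

    // Name of the exception type seen when the task is read via .Result.
    public static string ViaResult(Task<int> task)
    {
        try { int value = task.Result; return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // Name of the exception type seen via GetAwaiter().GetResult().
    public static string ViaGetResult(Task<int> task)
    {
        try { int value = task.GetAwaiter().GetResult(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }
}
```

With the second form, a plain catch block sees the NullReferenceException itself, so no Flatten() step is needed when only one task is being waited on.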


Thank you again for any advice.
eo_support
Posted: Thursday, September 12, 2019 4:48:35 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,218
Hi,

Thanks for the detailed information. Internally, EO.Pdf maintains a pool of WebView objects and dynamically checks WebView objects in and out of that pool as it performs conversions. The error you are seeing appears to indicate that one of the WebViews is marked as available but has in fact already crashed or been recycled. This should not happen, as our library is supposed to track these situations automatically and mark such WebViews as unavailable. So this appears to be a bug on our end.
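To make the pool behaviour described above concrete, here is a minimal sketch of a check-in/check-out pool that re-validates entries before handing them out. It is purely illustrative and is not EO.Pdf's implementation: Engine, EnginePool and the Crashed flag are hypothetical stand-ins for the internal WebView pool.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a pooled browser engine; not an EO.Pdf type.
public sealed class Engine
{
    public bool Crashed { get; set; }
}

public sealed class EnginePool
{
    private readonly ConcurrentQueue<Engine> _idle = new ConcurrentQueue<Engine>();

    // Return an engine to the pool; crashed engines are dropped.
    public void CheckIn(Engine engine)
    {
        if (!engine.Crashed) _idle.Enqueue(engine);
    }

    // Hand out a healthy engine. An engine can crash while sitting idle,
    // so each candidate is re-checked here and dead entries are discarded
    // instead of being given to a caller.
    public Engine CheckOut()
    {
        Engine engine;
        while (_idle.TryDequeue(out engine))
        {
            if (!engine.Crashed) return engine;
        }
        return new Engine(); // pool exhausted: create a fresh engine
    }
}
```

The failure mode described in this reply corresponds to skipping the check inside CheckOut, so that a caller receives an entry that is no longer usable.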

This is just our preliminary assessment; as we continue investigating this issue, we may have more or different findings. We will post again as soon as we have more information.

Thanks!
eo_support
Posted: Tuesday, September 17, 2019 9:55:41 AM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,218
Hi,

We have posted a new build that should alleviate this problem. Please download it from our download page and let us know how it goes.

The root of the problem is browser engine crashes. It is not technically possible for us to eliminate browser engine crashes completely. Many things can cause a browser engine to crash, such as temporary resource overload or even errors in the browser engine code. As such, the focus is on recovering when a crash occurs. This is similar to database deadlocks: there is no way to eliminate them completely, so a database server focuses not on eliminating deadlocks but on detecting and breaking them, at the price of causing a client application to fail.

The new build includes two changes regarding this issue:

1. Previous versions of EO.Pdf already had built-in code to detect and recover from a browser engine crash. However, your stack trace revealed one location where that check was not properly performed. We have fixed that issue.

2. Additionally, we have added code in our browser engine to better report such errors, and enhanced ConvertHtml/ConvertUrl to try the conversion again with a different engine (away from the engine that just crashed). This should significantly reduce the failure rate.

Since we were making changes to ConvertHtml/ConvertUrl, we did not comment on the 5 methods you listed in your original post. We will comment now with both before and after change information.

Before the change

1. There are essentially only two variants: one calling the converter straight, and one calling it through Task.Run on the thread pool. Exactly which one works better for you depends on your application. For example, if your application runs only a single thread and performs the conversions one by one, then obviously it can be slow. However, if your application already creates multiple threads and runs the conversions on those threads, then the net effect will be similar to using Task.Run;

2. Methods #1, #2 and #4 belong to the "straight" variant, and there is very little difference between them. There is very little performance difference between #1 and #2 because ClearResult simply sets a thread-static variable to null (it does, however, benefit memory usage, especially in a multi-threaded environment);

3. Methods #1 and #4 are identical because the code in #4 is exactly what ConvertHtml does;

4. For the same reason, #3 and #5 are identical.

After the change

After the change, #1 and #4 are no longer identical. Method #1 now includes logic to automatically retry the conversion with a different engine. Method #4 does not have this logic. The same difference exists between methods #3 and #5.
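For code that stays on the session-based methods (#4 or #5), a caller-side retry is one possible way to approximate this behaviour. The sketch below is an assumption-laden illustration: RetryHelper is a hypothetical helper, the attempt delegate stands in for the session code, and whether a newly created session lands on a different engine is not stated in this thread.

```csharp
using System;

public static class RetryHelper
{
    // Invokes `attempt` up to `maxAttempts` times and returns the first
    // successful result. If every attempt fails, the exception from the
    // last attempt propagates to the caller.
    public static T Run<T>(Func<T> attempt, int maxAttempts)
    {
        if (attempt == null) throw new ArgumentNullException(nameof(attempt));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int i = 1; ; i++)
        {
            try
            {
                return attempt();
            }
            catch (Exception) when (i < maxAttempts)
            {
                // Not the last attempt: swallow the error and loop again.
                // On the last attempt the filter is false, so the exception
                // is not caught here and keeps its original stack trace.
            }
        }
    }
}
```

A call might look like RetryHelper.Run(() => RenderWithSession(html, doc, options), 2), where RenderWithSession is a hypothetical method containing the code from method #4.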

We hope this provides enough detail. Please feel free to let us know if you still have any questions.

Thanks!

Carl De
Posted: Tuesday, September 17, 2019 1:12:32 PM
Rank: Newbie
Groups: Member

Joined: 2/18/2019
Posts: 8
Thank you for the quick turnaround and the explanations!

I will update and start load testing this afternoon.
Carl De
Posted: Thursday, September 19, 2019 4:04:05 PM
Rank: Newbie
Groups: Member

Joined: 2/18/2019
Posts: 8
Thank you again. The patch seems to work well.
eo_support
Posted: Thursday, September 19, 2019 4:26:37 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,218
Great. Thanks for confirming!

