Hello, we have identified some significant performance issues with EO.Pdf.
Context:
We are a large company that runs EO.Pdf HtmlToPdf.ConvertHtml. We typically run 16 conversions in parallel, and a batch can total between 1000 and 2000 generated PDFs.
Problem Summary:
1. Adding an HTML header/footer (HeaderHtmlFormat/FooterHtmlFormat) requires its own task, effectively doubling the required HtmlToPdf.MaxConcurrentTaskCount (this is not clear in the documentation, based on what I could find). It also decreases throughput considerably (about 50%).
2. Parallel executions do not scale linearly with processor count. We see a performance degradation of approximately 50% per conversion when running in parallel versus on a single thread. We would expect a ratio closer to 1:1 (this accounts for cold starts, etc.).
3. EO background threads consume large amounts of CPU, specifically EO.Base.ThreadRunnerBase. We suspect some sort of synchronous spin-locking is the culprit here. This effect is multiplied when combined with problem 1, which requires double the background threads.
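To illustrate what we mean by spin-locking in problem 3, here is a generic .NET sketch. It is not EO code (we cannot see EO's internals, so whether EO waits this way is only our suspicion), and the names WaitStyles, SpinFor, BlockFor and MeasureCpu are made up for the example. It compares the CPU time consumed by a thread that busy-waits with one that blocks on a wait handle for the same duration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class WaitStyles
{
    // Busy-wait: the thread keeps polling the clock, so it occupies a core
    // for the whole duration even though it does no useful work.
    public static void SpinFor(TimeSpan duration)
    {
        var sw = Stopwatch.StartNew();
        while (sw.Elapsed < duration)
        {
            Thread.SpinWait(100);
        }
    }

    // Blocking wait: the event is never signaled, so the thread sleeps until
    // the timeout expires and uses almost no CPU in the meantime.
    public static void BlockFor(TimeSpan duration)
    {
        using var gate = new ManualResetEventSlim(false);
        gate.Wait(duration);
    }

    // Process-wide CPU time consumed while the given wait runs.
    public static TimeSpan MeasureCpu(Action wait)
    {
        var process = Process.GetCurrentProcess();
        var before = process.TotalProcessorTime;
        wait();
        process.Refresh();
        return process.TotalProcessorTime - before;
    }
}
```

Calling WaitStyles.MeasureCpu(() => WaitStyles.SpinFor(TimeSpan.FromMilliseconds(200))) should report CPU time close to the wait duration, while the BlockFor variant should report close to zero. A profile dominated by waiting code that still burns CPU is what makes us suspect the first pattern.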
I am not sure how to post an image from profiling the benchmark code below, but here is a summary of the time spent in EO for one of our medium-sized batches:
5.3M ms of thread time (36% of overall) is spent in EO. This batch includes a lot of other business logic to gather data, process it, etc., so 36% of overall time inside EO is a huge chunk.
5.3M breakdown:
- 4.4M - inside EO.Base.ThreadRunnerBase (is this where the thread waiting is happening?)
- 0.384M - doing things with HtmlPdfSession (some method and RenderAsPdf) -- I assume this is actual work?
- 0.33M - doing things w/ WebView -- more actual work?
- 0.23M is spent in WaitableTask -- more spin-locking?
- 0.86M - in 4 background threads hitting EO.Internal (not sure, code is obfuscated)
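To put the breakdown in proportion, here is a small sketch that computes each bucket's share of the reported 5.3M ms. The figures are copied from the list above; the class and method names (ProfileBreakdown, ShareOfTotal, SumOfBuckets) are made up for the example.

```csharp
using System;
using System.Linq;

public static class ProfileBreakdown
{
    // Total thread time attributed to EO in the profiled batch, in ms.
    public const double TotalEoMs = 5.3e6;

    // Buckets exactly as listed in the breakdown, in ms.
    public static readonly (string Name, double Ms)[] Buckets =
    {
        ("EO.Base.ThreadRunnerBase", 4.4e6),
        ("HtmlPdfSession", 0.384e6),
        ("WebView", 0.33e6),
        ("WaitableTask", 0.23e6),
        ("EO.Internal background threads", 0.86e6),
    };

    // Fraction of the reported EO total taken by one bucket.
    public static double ShareOfTotal(double bucketMs) => bucketMs / TotalEoMs;

    // Sum of all listed buckets, to compare against the reported total.
    public static double SumOfBuckets() => Buckets.Sum(b => b.Ms);

    public static void Print()
    {
        foreach (var (name, ms) in Buckets)
        {
            Console.WriteLine($"{name}: {ShareOfTotal(ms):P0} of reported EO time");
        }
        Console.WriteLine($"Sum of buckets: {SumOfBuckets() / 1e6:F3}M ms");
    }
}
```

EO.Base.ThreadRunnerBase alone accounts for roughly 83% of the reported EO time. The listed buckets add up to about 6.2M ms rather than 5.3M ms, so some of them presumably overlap in the profiler's call tree; the figures are left as reported.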
Benchmark results:
- Single no header footer: 526.3ms
- Parallel no header footer: 844.6ms
- Single w/ header footer: 844.6ms
- Parallel w/ header footer: 1245.6ms
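For reference, the slowdown ratios implied by these numbers can be worked out with a sketch like the following. The timings are copied from the results above; BenchmarkRatios and Slowdown are names made up for the example.

```csharp
using System;
using System.Globalization;

public static class BenchmarkRatios
{
    // Mean times in ms, copied from the benchmark results.
    public const double SingleNoHeaderFooter = 526.3;
    public const double ParallelNoHeaderFooter = 844.6;
    public const double SingleWithHeaderFooter = 844.6;
    public const double ParallelWithHeaderFooter = 1245.6;

    // Ratio of a measured time to its baseline; 1.0 would mean no slowdown.
    public static double Slowdown(double baselineMs, double measuredMs) =>
        measuredMs / baselineMs;

    public static void Print()
    {
        PrintRatio("Parallel vs single, no header/footer",
            SingleNoHeaderFooter, ParallelNoHeaderFooter);
        PrintRatio("Parallel vs single, with header/footer",
            SingleWithHeaderFooter, ParallelWithHeaderFooter);
        PrintRatio("Header/footer vs none, single",
            SingleNoHeaderFooter, SingleWithHeaderFooter);
        PrintRatio("Header/footer vs none, parallel",
            ParallelNoHeaderFooter, ParallelWithHeaderFooter);
    }

    private static void PrintRatio(string label, double baselineMs, double measuredMs) =>
        Console.WriteLine(string.Format(
            CultureInfo.InvariantCulture, "{0}: {1:F2}x", label, Slowdown(baselineMs, measuredMs)));
}
```

With the numbers above, going parallel costs about 1.60x without a header/footer and about 1.47x with one, and adding a header/footer costs about 1.60x single-threaded and about 1.47x in parallel, which is where the "approximately 50%" figures in the problem summary come from.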
Benchmarking code:
Code: C#
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Exporters;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Validators;
using EO.Pdf;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace PwC.US.AM.TRACK.CLI.Forms.Test
{
    public static class EOTest
    {
        // One cold-start iteration, to measure first-call cost.
        public static readonly IConfig ColdStartConfig = ManualConfig.Create(DefaultConfig.Instance)
            .AddJob(Job.InProcess.WithStrategy(RunStrategy.ColdStart).WithIterationCount(1))
            .AddValidator(JitOptimizationsValidator.DontFailOnError)
            .AddDiagnoser(MemoryDiagnoser.Default)
            .AddExporter(MarkdownExporter.GitHub)
            .WithOptions(ConfigOptions.DisableLogFile | ConfigOptions.DisableOptimizationsValidator);

        // Ten measured iterations after one warmup, to measure steady-state throughput.
        public static readonly IConfig ThroughputConfig = ManualConfig.Create(DefaultConfig.Instance)
            .AddJob(Job.InProcess.WithIterationCount(10).WithWarmupCount(1))
            .AddValidator(JitOptimizationsValidator.DontFailOnError)
            .AddDiagnoser(MemoryDiagnoser.Default)
            .AddExporter(MarkdownExporter.GitHub)
            .WithOptions(ConfigOptions.DisableLogFile | ConfigOptions.DisableOptimizationsValidator);

        public static void Execute()
        {
            // Doubled because the header/footer appears to require its own task (see problem 1).
            HtmlToPdf.MaxConcurrentTaskCount = Environment.ProcessorCount * 2;

            // Measure cold start time.
            BenchmarkRunner.Run<Benchmark>(ColdStartConfig);

            // Measure throughput.
            BenchmarkRunner.Run<Benchmark>(ThroughputConfig);
        }

        public class Benchmark
        {
            public static readonly string Html = HtmlResource.test;

            // One work item per logical processor for the parallel benchmark.
            private static readonly int[] _parallel = Enumerable.Range(0, Environment.ProcessorCount).ToArray();

            [Params(false, true)]
            public bool HeaderFooter { get; set; }

            [Benchmark(Baseline = true)]
            public void Single() => ConvertHtml();

            [Benchmark]
            public void Parallel()
            {
                _parallel.AsParallel()
                    .WithDegreeOfParallelism(Environment.ProcessorCount)
                    .ForAll(x => ConvertHtml());
            }

            private void ConvertHtml()
            {
                HtmlToPdf.Options.PageSize = new SizeF(8.5f, 11f);
                float x = 0.5f, y = 0.8f, width = 7.5f, height = 9.6f;
                HtmlToPdf.Options.OutputArea = new RectangleF(x, y, width, height);

                // EO simply skips header & footer processing when no format is set.
                if (this.HeaderFooter)
                {
                    HtmlToPdf.Options.HeaderHtmlFormat = "<div></div>";
                    HtmlToPdf.Options.FooterHtmlFormat = "<div></div>";
                }

                var sw = Stopwatch.StartNew();
                var eoPdfDocument = new PdfDocument();
                HtmlToPdf.ConvertHtml(Html, eoPdfDocument);
                HtmlToPdf.ClearResult();
                sw.Stop();
                Console.WriteLine($"Html converted in {sw.Elapsed}");
            }
        }
    }
}