|
Rank: Newbie Groups: Member
Joined: 7/30/2018 Posts: 2
|
Hello, I am generating a PDF from HTML markup using the following code:
Code: C#
HtmlToPdfOptions options = new HtmlToPdfOptions();
options.JpegQualityLevel = 50;
options.OutputArea = new RectangleF(0.5f, 0.5f, 7.5f, 9.25f);
PdfDocument doc = new PdfDocument();
for (int i = 0; i < pages.Length; i++)
{
    string pageHtml = pages[i];
    if (i == 0)
    {
        // first page
        HtmlToPdf.ConvertHtml(pageHtml, doc, options);
    }
    else
    {
        PdfPage pdfPage = doc.Pages.Add();
        HtmlToPdf.ConvertHtml(pageHtml, pdfPage);
    }
}
The output file has 10 rather simple pages. If there are no img elements in the HTML, the total file size is 262kB, which is acceptable. However, if there are img elements in the HTML, the file size increases significantly (e.g. to 13MB, while the total size of the images is only 2MB). I tried changing JpegQualityLevel to 1 or 0, but that barely affects the file size (it varies from 13MB to 13.5MB). With other images the PDF size reaches up to 45MB, even though the total size of the image files is much smaller (e.g. 5MB). How can I reduce the impact of images on the total file size?
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Hi, based on your code, you only apply your conversion options (JpegQualityLevel) to the first page but not to the other pages, so you may want to change that. Additionally, if the same image appears on several pages of your HTML, the large size can be the result of that image being stored in the resulting PDF file multiple times (once per page). You can use the following strategy to avoid this:
Code: C#
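//Convert each page into a separate PdfDocument object
PdfDocument[] docs = new PdfDocument[pages.Length];
for (int i = 0; i < pages.Length; i++)
{
    //Each array element must be created before it is converted into
    docs[i] = new PdfDocument();
    //"options" is the HtmlToPdfOptions object from your code,
    //passed here so that it applies to every page
    HtmlToPdf.ConvertHtml(pages[i], docs[i], options);
}
//Merge them into a single PdfDocument
PdfDocument result = PdfDocument.Merge(docs);
//Save the result
result.Save(file_pdf_file_name);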
Please let us know if this reduces the file size for you. Thanks!
|
|
Rank: Newbie Groups: Member
Joined: 7/30/2018 Posts: 2
|
Thanks a lot. Applying the options to each page individually as well as merging separate docs results in a great size reduction and a similar size (3MB in this case).
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Great. Glad to hear that it works for you! Please feel free to let us know if you have any more questions.
|
|
Rank: Member Groups: Member
Joined: 5/25/2015 Posts: 21
|
By the way, this only optimizes image compression; images with a high resolution are a different case. I have an issue where a customer complains that an identical document created with MS Word is about 4-5x smaller, so the resulting PDFs are too big to be sent as email attachments. With an external PDF program I confirmed that ~99% of the size comes from images (the compression rate is already high), and after reducing the resolution to 150ppi I was able to get that 4-5x reduction in size. Previously there was an option to automatically reduce image sizes (which was unusable in practice since the ppi wasn't parameterized). Processing the resolution could be as trivial as:
Quote:
void PostProcessImages(object sender, PdfPageEventArgs args)
{
    var images = args.Page.Contents
        .Flatten(c => c.Contents)
        .OfType<PdfImageContent>();
    foreach (var image in images)
    {
        image.AutoScale(); // Or whatever custom logic with image.Image
    }
}
...
options.AfterRenderPage += PostProcessImages;
...
...but since EO handles PDFs as write-only and Page.Contents always returns only raw content, this won't work. Is there currently any way to scale images or really read PDFs?
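To make the 150ppi figure concrete, here is a minimal sketch of the arithmetic only. It does not use any EO.Pdf API; the class and method names (ImageDownscale, TargetSize) and the parameters are hypothetical and chosen purely for illustration. It assumes you know an image's pixel size and the width at which it is drawn on the page:

```csharp
using System;

// Hypothetical helper, not part of EO.Pdf: computes the pixel size an image
// should be reduced to so that its effective resolution does not exceed a
// target ppi at the width it is drawn on the page.
static class ImageDownscale
{
    public static (int Width, int Height) TargetSize(
        int pixelWidth, int pixelHeight, double printedWidthInches, double targetPpi)
    {
        // Effective resolution = pixels available per printed inch.
        double effectivePpi = pixelWidth / printedWidthInches;

        // Never upscale: images at or below the target are left unchanged.
        if (effectivePpi <= targetPpi)
            return (pixelWidth, pixelHeight);

        // Scale both dimensions by the same factor to keep the aspect ratio.
        double scale = targetPpi / effectivePpi;
        return ((int)Math.Round(pixelWidth * scale),
                (int)Math.Round(pixelHeight * scale));
    }
}
```

For example, ImageDownscale.TargetSize(3000, 2000, 5.0, 150.0) gives (750, 500): a 3000x2000 image drawn 5 inches wide has an effective 600ppi, so it is scaled by 0.25 and holds 16x fewer pixels. The actual file size saving depends on the image content and compression.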
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Hi,
There is no way to do image compression on existing files for now. Hopefully we can implement this feature in the future.
Thanks!
|
|