|
Rank: Newbie Groups: Member
Joined: 2/8/2017 Posts: 6
|
We use the code below to generate a PDF document from a dynamic HTML page. The HTML page references images that are embedded in the PDF. However, a subsequent call of the code below sometimes appears to re-use the cookies from a previous session. This is a major issue for us, because a document could end up containing images that its requester is not authorised to see. We're currently on version 18.0.15.
Code:
var options = new HtmlToPdfOptions
{
    PageSize = new SizeF(
        settings.PageWidth.MillimetresToInches(),
        settings.PageHeight.MillimetresToInches()),
    OutputArea = new RectangleF(
        settings.ContentLeft.MillimetresToInches(),
        settings.ContentTop.MillimetresToInches(),
        settings.ContentWidth.MillimetresToInches(),
        settings.ContentHeight.MillimetresToInches()),
    AutoFitX = settings.FitX.FromPdfAutoFitModeEnum(),
    MaxLoadWaitTime = MAX_LOAD_WAIT_TIME
};
HtmlToPdf.ConvertHtml(html, stream, options);
As a side-note, where can I find the release notes for EO.Pdf?
|
|
Rank: Newbie Groups: Member
Joined: 2/8/2017 Posts: 6
|
After some more digging, I have found what is causing this. We pass rendered HTML into HtmlToPdf.ConvertHtml, and that HTML references an image on a webserver. The webserver sets a cookie on the response, which is then stored by the (hidden) Chrome instance. The list returned by `session.GetCookies()` is empty, because the HTML page is loaded as a data URI, and reading the cookies from script fails with the error below.
Code:
Source: Line #: 1 Severity: Error Uncaught SecurityError: Failed to read the 'cookie' property from 'Document': Access is denied for this document.
So we started looking into clearing the session after each use. However, HtmlToPdf doesn't seem to provide an option for this. The recommended approach is a Stop method on the WebBrowser engine, but it's not clear how to use WebBrowser in combination with HtmlToPdf. The session does accept a WebViewCallback, but stopping the engine with the code below results in the following errors after a few concurrent requests (in our use case we process many documents concurrently):
InvalidOperationException: WebView does not exist.
Exception: Browser engine failed to render page. Failed on command 4
Code:
using (var s = HtmlToPdfSession.Create())
{
    s.LoadHtml(html);
    s.RenderAsPDF(stream);
    s.RunWebViewCallback(new EO.WebBrowser.WebViewCallback((webView, a) =>
    {
        webView.Engine.Stop(true);
        //webView.Engine.Start(); // tried with and without this
        return null;
    }), null);
}
Stopping and starting the engine also costs a lot of performance: on my development machine, rendering a simple HTML page goes from 0.6s to 2.0s. So our question is:
How can we perform HTML-to-PDF conversion with good performance while also isolating conversions from each other? We need to clear the cache and cookies after each run, without tearing down the whole engine.
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Hi, There is no easy way to do that with the current version. The most effective way is to load the site's sign-out page, which clears the cookies. However, this depends on the web server. If you know exactly which cookie you need to remove, a more complex method is possible with a custom resource handler. The basic idea is as follows:
Code: C#
//Perform regular conversion
s.RenderAsPDF(stream);
//Use a custom resource handler to remove cookies. You will
//need to implement RemoveCookieHandler. See comment below
RemoveCookieHandler handler = new RemoveCookieHandler();
s.RunWebViewCallback((webView, a) =>
{
    webView.RegisterResourceHandler(handler);
    webView.LoadUrlAndWait("http://yoursite.com/logout");
    webView.UnregisterResourceHandler(handler);
    return null;
}, null);
Here the Url "http://yoursite.com/logout" is a fake Url that does not exist. You will need to intercept this Url in your RemoveCookieHandler and attach an expired cookie to the response.Cookies collection in your custom resource handler's Process method. Once the browser engine receives this cookie, it removes the cookie from the cookie jar because its expiry date is in the past. Note that you will also need to modify the Url so that it matches the actual site that you are trying to log off from. You can find more information on how to use a custom resource handler here: https://www.essentialobjects.com/doc/webbrowser/advanced/resource_handler.aspx
Please let us know if this works for you or if you have any questions. Thanks!
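To illustrate the "expired cookie" idea described above, here is a minimal sketch that only builds the text of such a Set-Cookie header using the .NET base class library. It does not use any EO.Pdf API. The cookie name `SessionId` and the helper `BuildHeader` are hypothetical, and how the cookie is actually attached to response.Cookies inside the resource handler is not shown, since that depends on the handler API.

```csharp
using System;
using System.Globalization;

public static class ExpiredCookie
{
    // Builds a Set-Cookie header value whose expiry date lies in the past.
    // A browser that receives it drops the stored cookie with the same
    // name, domain and path, so all three must match the original cookie.
    public static string BuildHeader(string name, string domain, string path = "/")
    {
        DateTime past = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0}=; Expires={1:R}; Max-Age=0; Domain={2}; Path={3}",
            name, past, domain, path);
    }

    public static void Main()
    {
        // "SessionId" is a hypothetical cookie name used for illustration.
        Console.WriteLine(BuildHeader("SessionId", "yoursite.com"));
    }
}
```

This only works for cookies whose name and site you know in advance, which is why the approach does not cover cookies set by third-party sites.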
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Forgot to mention that you can find our change logs here: https://www.essentialobjects.com/ChangeLog.aspx
This is the closest thing we have to release notes.
|
|
Rank: Newbie Groups: Member
Joined: 2/8/2017 Posts: 6
|
Would it be possible to just disable all cookies, similar to "NoScript"?
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Hi,
We do not have a "no cookie" option. However, we are looking into adding an explicit cookie management interface through which you can add and delete cookies. We will reply here again when we have an update.
Thanks!
|
|
Rank: Newbie Groups: Member
Joined: 2/8/2017 Posts: 6
|
The proposed solution of hooking up a custom resource handler looks brittle. It also doesn't help with scripts or images that target websites outside my control (third-party cookies). So I'm really hoping for a way to achieve complete isolation between invocations of `ConvertHtml`: after each invocation, all cache is cleared, all cookies are removed and all other data is removed. Effectively "clear browser history" after each PDF is rendered. I'm fine with adding some overhead to the conversion, but ideally none. "Clear browser history" in Chrome is usually also very fast, especially when there's almost no history to clear.
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Hi, We have posted a new build (18.1.91) that adds explicit cookie access: https://www.essentialobjects.com/doc/eo.webengine.cookiemanager.aspx
You can delete all the cookies by passing null for both arguments (or by omitting both arguments, since their default values are null): https://www.essentialobjects.com/doc/eo.webengine.cookiemanager.deletecookies.aspx
There is still no interface for clearing the cache though. Thanks!
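As a rough sketch of how this could fit into the session code posted earlier in the thread: the callback pattern below is copied from that earlier post, and `DeleteCookies()` with no arguments is the call described above. Reaching the cookie manager through `webView.Engine.CookieManager` is an assumption, as is the wrapper name `CookieFreeConversion.Convert`; check the linked documentation for the actual access path.

```csharp
using System.IO;
using EO.Pdf;
using EO.WebBrowser;

public static class CookieFreeConversion
{
    // Hypothetical wrapper: renders the PDF, then deletes all cookies.
    public static void Convert(string html, Stream stream)
    {
        using (var s = HtmlToPdfSession.Create())
        {
            s.LoadHtml(html);
            s.RenderAsPDF(stream);
            s.RunWebViewCallback(new WebViewCallback((webView, a) =>
            {
                // Assumption: the engine exposes its CookieManager as
                // Engine.CookieManager. Both arguments are left at their
                // null defaults, which deletes all cookies.
                webView.Engine.CookieManager.DeleteCookies();
                return null;
            }), null);
        }
    }
}
```

As noted later in the thread, sessions can share an engine, so this call may also remove cookies that a concurrently running conversion still relies on.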
|
|
Rank: Newbie Groups: Member
Joined: 2/8/2017 Posts: 6
|
Thank you for the response. Is this method meant to be called from the callback discussed earlier in this topic? And is it thread-safe, meaning that it only runs against the WebEngine used by the current HtmlToPdf session and doesn't affect other simultaneously running instances?
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,258
|
Yes. You can call it in the callback method and it's thread safe. However, it DOES affect other instances, because multiple HtmlToPdfSession objects can share a single Engine object (internally it maintains a pool of Engine objects). If you want to completely isolate the engine object, you can use separate AppDomains.
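A minimal sketch of the separate-AppDomain suggestion, assuming .NET Framework (`AppDomain.CreateDomain` is not supported on .NET Core and later). The class and method names and the data keys are hypothetical, and the conversion call itself is left as a comment because the stream has to be created inside the child domain. Whether this also isolates cookies stored on disk is not confirmed in this thread.

```csharp
using System;

public static class IsolatedConversion
{
    // Runs inside the child AppDomain. Static state is kept per AppDomain,
    // so objects created here are not shared with the parent domain.
    private static void ConvertInChildDomain()
    {
        var html = (string)AppDomain.CurrentDomain.GetData("html");
        var outputPath = (string)AppDomain.CurrentDomain.GetData("outputPath");

        // Assumption: open a stream on outputPath here and run the
        // conversion from the first post, i.e.
        // HtmlToPdf.ConvertHtml(html, stream, options).
    }

    public static void Convert(string html, string outputPath)
    {
        // One child domain per conversion; the name only aids debugging.
        AppDomain domain = AppDomain.CreateDomain("pdf-" + Guid.NewGuid());
        try
        {
            // Strings are passed to the child domain by value.
            domain.SetData("html", html);
            domain.SetData("outputPath", outputPath);
            domain.DoCallBack(ConvertInChildDomain);
        }
        finally
        {
            // Unloading discards everything the child domain created.
            AppDomain.Unload(domain);
        }
    }
}
```

Creating and unloading a domain per conversion is likely to add overhead similar to restarting the engine, so a small set of long-lived domains (for example one per tenant) may be a better trade-off.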
|
|