EO.Pdf sharing session cookies?
Bouke
Posted: Monday, May 28, 2018 7:58:22 AM
Rank: Newbie
Groups: Member

Joined: 2/8/2017
Posts: 6
We use the code below to generate a PDF document from a dynamic HTML page. The HTML page references images to be embedded. However, a subsequent call of the code below sometimes appears to reuse the cookies from a previous session. This is a major issue for us, because it could lead to unauthorised images appearing in a document. We're currently on version 18.0.15.

Code:

            var options = new HtmlToPdfOptions
            {
                PageSize = new SizeF(
                    settings.PageWidth.MillimetresToInches(),
                    settings.PageHeight.MillimetresToInches()),
                OutputArea = new RectangleF(
                    settings.ContentLeft.MillimetresToInches(),
                    settings.ContentTop.MillimetresToInches(),
                    settings.ContentWidth.MillimetresToInches(),
                    settings.ContentHeight.MillimetresToInches()),
                AutoFitX = settings.FitX.FromPdfAutoFitModeEnum(),
                MaxLoadWaitTime = MAX_LOAD_WAIT_TIME
            };

            HtmlToPdf.ConvertHtml(html, stream, options);


As a side-note, where can I find the release notes for EO.Pdf?
Bouke
Posted: Tuesday, May 29, 2018 8:54:22 AM
Rank: Newbie
Groups: Member

Joined: 2/8/2017
Posts: 6
After some more digging, I have found what is causing this. We pass rendered HTML into HtmlToPdf.ConvertHtml, and that HTML references an image on a web server. The web server sets a cookie on the response, which is then stored by the (hidden) Chrome instance. The list returned by `session.GetCookies()` is empty, as the HTML page is a data URI.

Code:

Source:
Line #: 1
Severity: Error
Uncaught SecurityError: Failed to read the 'cookie' property from 'Document': Access is denied for this document.


So we started looking into clearing the session after each use. However, HtmlToPdf doesn't seem to provide an option for this. The recommended approach is the Stop method on the WebBrowser engine, but it's not clear how to use WebBrowser in combination with HtmlToPdf. HtmlToPdfSession does accept a WebViewCallback, but stopping the engine from it with the code below results in the following errors after a few concurrent requests (in our use case we process many documents concurrently):

  • InvalidOperationException: WebView does not exist.
  • Exception: Browser engine failed to render page. Failed on command 4

    Code:

    using (var s = HtmlToPdfSession.Create())
    {
        s.LoadHtml(html);
        s.RenderAsPDF(stream);
        s.RunWebViewCallback(new EO.WebBrowser.WebViewCallback((webView, a) =>
        {
            webView.Engine.Stop(true);
            //webView.Engine.Start(); // tried with and without this
            return null;
        }), null);
    }


    Stopping and starting the engine also causes a significant performance degradation; on my development machine, rendering a simple HTML page goes from 0.6s to 2.0s. So our question is the following:

    How can we perform HTML-to-PDF conversion with good performance while also isolating conversions from each other? We need to clear the cache and cookies after each run, without tearing down the whole engine.
    eo_support
    Posted: Tuesday, May 29, 2018 5:23:24 PM
    Rank: Administration
    Groups: Administration

    Joined: 5/27/2007
    Posts: 24,258
    Hi,

    There is no easy way to do that with the current version. The most effective way is to invoke the sign-out page, which clears the cookies. However, this depends on the web server. If you know exactly which cookie you need to remove, a more complex method is possible with a custom resource handler. The basic idea is as follows:

    Code: C#
    //Perform regular conversion
    s.RenderAsPDF(stream);
    
    //Use a custom resource handler to remove cookies. You will
    //need to implement RemoveCookieHandler. See comment below
    RemoveCookieHandler handler = new RemoveCookieHandler();
    s.RunWebViewCallback((webView, a) =>
    {
        webView.RegisterResourceHandler(handler);
        webView.LoadUrlAndWait("http://yoursite.com/logout");
        webView.UnregisterResourceHandler(handler);
        return null; //The callback must return a value
    }, null);


    Here the URL "http://yoursite.com/logout" is a fake URL that does not exist. You will need to intercept this URL in your RemoveCookieHandler and attach an expired cookie to the response.Cookies collection in your custom resource handler's Process method. Once the browser engine receives this cookie, it will remove it from the cookie jar because its expiry date is in the past. Note that you will also need to modify the URL so that it matches the actual site that you are trying to log off from.
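
To illustrate what "an expired cookie" means on the wire, here is a minimal sketch that builds the Set-Cookie header value a server would send to delete a cookie. It does not use any EO API: the class name `ExpiredCookie`, the method `BuildHeaderValue` and the cookie name "session" are made up for this example, and how the equivalent cookie object is attached to response.Cookies is not shown here.

```csharp
using System;
using System.Globalization;

static class ExpiredCookie
{
    // Builds a Set-Cookie header value that asks the browser to drop the
    // named cookie: the value is empty and the expiry date is in the past.
    // (Hypothetical helper, not part of EO.Pdf or EO.WebBrowser.)
    public static string BuildHeaderValue(string name, string path = "/")
    {
        // Any past date works; the Unix epoch is a common choice.
        var epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

        // "R" formats the date as RFC 1123, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        return string.Format(CultureInfo.InvariantCulture,
            "{0}=; Expires={1:R}; Path={2}", name, epoch, path);
    }
}
```

Browsers match a cookie by name, domain and path, so the expired cookie only replaces the original if those match the values the real server used when it set the cookie.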

    You can find more information on how to use custom resource handler here:

    https://www.essentialobjects.com/doc/webbrowser/advanced/resource_handler.aspx

    Please let us know if this works for you or if you have any questions.

    Thanks!
    eo_support
    Posted: Tuesday, May 29, 2018 9:56:55 PM
    Rank: Administration
    Groups: Administration

    Joined: 5/27/2007
    Posts: 24,258
    Forgot to mention that you can find our change logs here:

    https://www.essentialobjects.com/ChangeLog.aspx

    This is the closest thing we have to release notes.
    Bouke
    Posted: Wednesday, May 30, 2018 12:26:54 AM
    Rank: Newbie
    Groups: Member

    Joined: 2/8/2017
    Posts: 6
    Would it be possible to just disable all cookies, similar to "NoScript"?
    eo_support
    Posted: Wednesday, May 30, 2018 5:18:48 PM
    Rank: Administration
    Groups: Administration

    Joined: 5/27/2007
    Posts: 24,258
    Hi,

    We do not have a "no cookies" option. However, we are looking into adding an explicit cookie management interface through which you can add and delete cookies. We will reply here again when we have an update.

    Thanks!
    Bouke
    Posted: Thursday, May 31, 2018 2:13:04 PM
    Rank: Newbie
    Groups: Member

    Joined: 2/8/2017
    Posts: 6
    The proposed solution of hooking up a custom resource handler looks brittle. It also doesn't help me with scripts/images targeting websites outside my control (third-party cookies). So I'm really hoping to see a way to achieve complete isolation between invocations of `ConvertHtml`: after each invocation, all cache is cleared, all cookies are removed and all other data is removed. Effectively "clear browser history" after each PDF is rendered. I can accept some overhead on the conversion, though ideally there would be none. "Clear browser history" in Chrome is usually also very fast, especially when there's almost no history to be cleared.
    eo_support
    Posted: Friday, June 1, 2018 3:19:03 PM
    Rank: Administration
    Groups: Administration

    Joined: 5/27/2007
    Posts: 24,258
    Hi,

    We have posted a new build (18.1.91) that adds explicit cookie access:

    https://www.essentialobjects.com/doc/eo.webengine.cookiemanager.aspx

    You can delete all the cookies by passing null for both arguments (or by omitting both arguments, since their default values are null):

    https://www.essentialobjects.com/doc/eo.webengine.cookiemanager.deletecookies.aspx

    There is still no interface for clearing cache though.

    Thanks!
    Bouke
    Posted: Saturday, June 2, 2018 1:33:43 AM
    Rank: Newbie
    Groups: Member

    Joined: 2/8/2017
    Posts: 6
    Thank you for the response. Is this method then to be called using the callback method discussed earlier in this topic? And is it thread-safe, meaning it will only run for the WebEngine used by the current HtmlToPdf session and won't affect other simultaneously running instances?
    eo_support
    Posted: Saturday, June 2, 2018 8:39:38 AM
    Rank: Administration
    Groups: Administration

    Joined: 5/27/2007
    Posts: 24,258
    Yes. You can call it in the callback method and it's thread safe. It DOES affect other instances, because multiple HtmlToPdfSession objects can share a single Engine object (internally it maintains a pool of Engine objects). If you want to completely isolate the engine object, you can use separate AppDomains.
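
Putting the pieces of this thread together, a conversion that deletes all cookies afterwards might look roughly like the sketch below. It reuses the session and callback calls already shown in this topic, and it needs build 18.1.91 or later. Reaching the cookie manager through `webView.Engine.CookieManager` is an assumption (check the CookieManager documentation linked above), and `IsolatedPdf.Render` is a made-up wrapper name.

```csharp
using System.IO;
using EO.Pdf;
using EO.WebBrowser;

static class IsolatedPdf
{
    // Hypothetical wrapper: render one document, then delete all cookies.
    public static void Render(string html, Stream stream)
    {
        using (var s = HtmlToPdfSession.Create())
        {
            s.LoadHtml(html);
            s.RenderAsPDF(stream);

            s.RunWebViewCallback(new WebViewCallback((webView, a) =>
            {
                // Assumption: the Engine exposes its CookieManager as a property.
                // DeleteCookies() with no arguments removes every cookie.
                webView.Engine.CookieManager.DeleteCookies();
                return null;
            }), null);
        }
    }
}
```

Because engines are pooled and shared, this also removes cookies that a concurrently running session may still be relying on, and it does not clear the cache. Where strict isolation is required, the separate AppDomain approach mentioned above remains the safer option.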

