|
Rank: Advanced Member Groups: Member
Joined: 1/12/2015 Posts: 100
|
I'm trying this code: webView.LoadHtml(html, "https://www.bestbuy.ca/en-ca/product/samsung-hw-b650-3-1-channel-sound-bar-with-wireless-subwoofer/16000035")
But I noticed that additional requests are coming in with a referer like this: https://www.bestbuy.ca/en-ca/product/samsung-hw-b650-3-1-channel-sound-bar-with-wireless-subwoofer/eo_loadhtml_87be039c-9b54-4ab9-bd3b-0b3b8bbfbfbb.html
Also, some of the JavaScript that loads is supposed to read "16000035" from the original URL and construct new request URLs from it. Instead, it's making requests with "eo_loadhtml_87be039c-9b54-4ab9-bd3b-0b3b8bbfbfbb.html" in place of "16000035". It seems that LoadHtml is not actually using the URL parameter as specified and is instead replacing the last part of the URL with "eo_loadhtml_<some guid>".
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,338
|
Hi, This is the designed behavior. BaseUrl and Url are not the same. BaseUrl is used to establish the page's security context and to expand partial (relative) Urls into full Urls. In your case, if your Url is:
Code:
https://www.bestbuy.ca/en-ca/product/samsung-hw-b650-3-1-channel-sound-bar-with-wireless-subwoofer/16000035
Then your BaseUrl will be:
Code:
https://www.bestbuy.ca/en-ca/product/samsung-hw-b650-3-1-channel-sound-bar-with-wireless-subwoofer
Setting baseUrl to "https://www.bestbuy.ca/en-ca/product/samsung-hw-b650-3-1-channel-sound-bar-with-wireless-subwoofer/16000035" will cause the HTML to PDF converter to load a fake Url like this:
Code:
https://www.bestbuy.ca/en-ca/product/samsung-hw-b650-3-1-channel-sound-bar-with-wireless-subwoofer/16000035/eo_loadhtml_someguid.html
Note that here "16000035" becomes part of the "base", which is obviously not what is intended. If you wish to precisely mimic the behavior of loading that Url, you would have to use ConvertUrl with exactly that Url. However, you can still use a custom ResourceHandler to alter the exact HTML content that is loaded into the underlying WebView; in fact, this is exactly how LoadHtml with a baseUrl works. Thanks!
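To make the base-versus-full Url distinction concrete, here is a minimal sketch using .NET's System.Uri rather than EO.WebBrowser itself. It assumes the engine resolves the generated file name with standard relative-reference rules; the class name BaseUrlDemo, its helper methods, and the "eo_loadhtml_someguid.html" name are hypothetical and only for illustration. It shows that a base without a trailing slash has its last segment replaced, a base with a trailing slash keeps it, and a script that reads the last path segment no longer sees "16000035" in either case.

```csharp
using System;

public static class BaseUrlDemo
{
    // Resolve a relative reference against a base Url with System.Uri.
    public static string Resolve(string baseUrl, string relative) =>
        new Uri(new Uri(baseUrl), relative).AbsoluteUri;

    // Return the last path segment, the way a page script might read a product id.
    public static string LastSegment(string url)
    {
        string[] segments = new Uri(url).Segments;
        return segments[segments.Length - 1].TrimEnd('/');
    }

    public static void Main()
    {
        const string pageUrl =
            "https://www.bestbuy.ca/en-ca/product/samsung-hw-b650-3-1-channel-sound-bar-with-wireless-subwoofer/16000035";
        const string fakeName = "eo_loadhtml_someguid.html"; // hypothetical generated name

        // No trailing slash: "16000035" is treated as a file name and is replaced.
        string replaced = Resolve(pageUrl, fakeName);
        Console.WriteLine(replaced);

        // Trailing slash: "16000035" is treated as a folder and is kept.
        string appended = Resolve(pageUrl + "/", fakeName);
        Console.WriteLine(appended);

        // Either way, the last segment is now the generated file name.
        Console.WriteLine(LastSegment(pageUrl));   // 16000035
        Console.WriteLine(LastSegment(replaced));  // eo_loadhtml_someguid.html
        Console.WriteLine(LastSegment(appended));  // eo_loadhtml_someguid.html
    }
}
```

The first result matches the referer reported in the original question, which is why page scripts that parse the last path segment pick up the generated file name instead of the product id.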
|
|
Rank: Advanced Member Groups: Member
Joined: 1/12/2015 Posts: 100
|
I had always thought that LoadHtml just loaded the URL but replaced the response with the HTML provided. I'm only using the WebView, not the Pdf class, so I can't use ConvertUrl. I guess I'll just have to use a custom ResourceHandler to do that.
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,338
|
Hi,
Yes. You would use LoadUrl with a custom resource handler instead. I thought you were using HtmlToPdf but the idea is the same for WebView since internally HtmlToPdf uses WebView. In this regard ConvertUrl is the same as LoadUrl. So this is the method you should use.
Thanks!
|
|
Rank: Advanced Member Groups: Member
Joined: 1/12/2015 Posts: 100
|
I'm trying the following code:
Code: C#
// Serves the supplied HTML in place of the response for one specific URL.
private class HtmlLoader_EOBrowser : ResourceHandler
{
    private readonly string url, html;

    public HtmlLoader_EOBrowser(string url, string html)
    {
        this.url = url;
        this.html = html;
    }

    // Intercept only the request whose URL matches (case-insensitive).
    public override bool Match(Request request)
    {
        return string.Compare(request.Url, url, true) == 0;
    }

    // Answer the intercepted request with the supplied HTML.
    public override void ProcessRequest(Request request, Response response)
    {
        response.ContentType = "text/html";
        response.ContentEncoding = "utf-8";
        response.Write(html);
    }
}

// Register the handler, load the URL, and unregister once navigation is done.
var rh = new HtmlLoader_EOBrowser(url, html);
webView.RegisterResourceHandler(rh);
var nav = webView.LoadUrl(url);
nav.OnDone(() => webView.UnregisterResourceHandler(rh), false);
nav.WaitOne();
The initial URL is not actually being sent to the server as a request and no cookies are being sent. All I want to do is have the browser make the initial request and then replace the HTML from the response with the HTML I specify. So basically the browser will think it's on the correct URL but it will have the HTML I specify. Once it starts to render that, it will then render any other additional resources as normal. But that isn't happening in either my ResourceHandler code or with your LoadHtml method. What do you suggest?
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,338
|
Your code is correct. Can you try to clear the cache folder and then see if it works?
|
|
Rank: Advanced Member Groups: Member
Joined: 1/12/2015 Posts: 100
|
I tried clearing the cache folder but the initial request is never made to the server.
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,338
|
Hi, I am sorry, I didn't understand your initial question correctly. You said:
Quote: The initial URL is not actually being sent to the server as a request and no cookies are being sent. All I want to do is have the browser make the initial request and then replace the HTML from the response with the HTML I specify
This is not possible with a single request. A single request either goes to the server or it doesn't. When you use a resource handler to intercept it, it does not go to the server, because you have indicated that you wish to take over this resource request. There is no built-in support for mixing these two options as you described, that is, sending the request to the server first and then modifying the response. The only way to do that is through a proxy server. You can then do any kind of interception/modification in your proxy server. Thanks!
|
|
Rank: Advanced Member Groups: Member
Joined: 1/12/2015 Posts: 100
|
I understand. Are there any plans to add this feature (without the use of proxies) to future versions of EO browser?
|
|
Rank: Administration Groups: Administration
Joined: 5/27/2007 Posts: 24,338
|
There is no plan to add that. The network layer and the parser/rendering layer are very tightly integrated for performance reasons. Inside the browser engine there is no single point where it holds the entire response HTML; the response is received in chunks from the network layer and parsed as it arrives. You can see this in action with an extremely long HTML page, where the top portion of the page is already rendered before the entire page finishes loading. In other words, the only point at which you can fully examine the entire response is after the page finishes loading. However, at that point some part of the page has already been rendered, which means it might already be too late.
|
|