Tags: browser, javascript, userscripts

Loading HTML from a same-domain link via userscript


I'm working on a JavaScript browser userscript (for use with Violentmonkey etc.) which is supposed to add the seller's registration location to ebay.de/sch/... search result pages (and those of other eBay domains), simply by loading that info from each search result's product page.

I'm not very good at JavaScript, but a friend wrote a function for me which takes a URL, makes a GET request and returns the HTML as a document (logging added by me):

  async function getUrlDocument(url) {
    console.log("url="+url)
    const response = await fetch(url,{headers:reqhead});
    if (response.ok)
    {
      var parser = new DOMParser();
      console.log("response URL="+await parser.parseFromString(await response, 'text/html').URL);
      return await parser.parseFromString(await response, 'text/html');
    }
    return null;
  }

The request headers look like this (I simply copied most of it from a normal browser request):

  const reqhead = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'X-Requested-With': 'XMLHttpRequest',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/118.0',
    'Host': 'www.ebay.de',
    'Referer': document.URL,
    'Cookie': document.cookie,
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site':   'same-origin',
    'Sec-Fetch-User':   '?1',
    'TE':   'trailers',
    'Upgrade-Insecure-Requests': 1
  }

However, in testing, the function consistently seems to return the search results page rather than the product page. Here's what the log says (parameter values redacted for privacy):

url=https://www.ebay.de/itm/184075911180?epid=[...]&hash=[...]
response URL=https://www.ebay.de/sch/i.html?_nkw=[search string]&_sacat=0&LH_TitleDesc=0&_odkw=[...]&_osacat=0

I don't understand why this happens, and neither does my friend. The /sch/ URL is just the page the script is running on. Why would the function return that instead of the one used in the fetch request?

My browser is Firefox.


Solution

  • The URL being reported would be wrong even if the document were the correct one: the DOMParser isn't given any URL, so parseFromString().URL returns the URL of the current page instead. The function in the question also passed the Response object itself to parseFromString() rather than the response's body text. The returned URL doesn't really matter in my script, though; I was only using it for testing.

    Anyway, I'm using this now and it works fine:

      async function getUrlDocument(url) {
        try {
          const response = await fetch(url, { headers: reqhead });
          // DOMParser needs the body text, not the Response object itself.
          const html = await response.text();
          return new DOMParser().parseFromString(html, "text/html");
        } catch (error) {
          // Keep the original error as the cause so failures stay debuggable.
          throw new Error("Error: Failure to fetch page.", { cause: error });
        }
      }
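
For anyone who still wants to log which page a request actually returned, the URL can be read from the Response object rather than from the parsed document. The sketch below is only an illustration: `fetchHtmlWithUrl` and its `fetchImpl` parameter are hypothetical names, not part of the script above. It relies on the standard `response.url` and `response.redirected` properties.

```javascript
// Hypothetical helper (not from the original script): fetch a page and
// report the URL the response actually came from.
// fetchImpl is an assumed parameter so the function can be tried with a
// stand-in for fetch; in a userscript it defaults to the browser's fetch.
async function fetchHtmlWithUrl(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error("HTTP " + response.status + " for " + url);
  }
  return {
    finalUrl: response.url,          // URL of the response, after any redirects
    redirected: response.redirected, // true if the request was redirected
    html: await response.text(),     // body text, ready for DOMParser
  };
}
```

In the userscript, the `html` field could then be passed to `new DOMParser().parseFromString(html, "text/html")` as before, while `finalUrl` is logged in place of the parsed document's `.URL`.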