How to retrieve DOM from a fully loaded page with the Fetch API


I am building a feature that adds infinite scrolling to a site with paginated content, similar to an Amazon product page, using the Fetch API. On Amazon, you can move between pages by changing the value of the 'page' URL parameter to 2, 3, 4, and so on. I intend to use this mechanism to fetch the next page's product information with the Fetch API, without the user having to click. I would also like to handle that page's content as a Document Object Model (DOM).
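To illustrate the page-parameter mechanism, here is a minimal sketch that builds the URL for a given page number. The helper name `pageUrl` is hypothetical, and the sketch assumes the site reads the page number from a `page` query parameter as described above.

```javascript
// Hypothetical helper: return `baseUrl` with its `page` query parameter
// set to `page`, leaving every other parameter untouched.
function pageUrl(baseUrl, page) {
    const url = new URL(baseUrl);
    // set() replaces an existing `page` value or adds one if it is missing.
    url.searchParams.set("page", String(page));
    return url.toString();
}
```

For example, `pageUrl("https://www.amazon.com/s?k=python&page=3", 4)` yields the same search URL with `page=4`.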

As a result of my initial attempts, I created the following code and executed it from the Console in the developer tools:

fetch('https://www.amazon.com/s?k=python&page=3')
    .then(res => res.text())
    .then(text => new DOMParser().parseFromString(text, 'text/html'))
    .then(document => {
        console.log(document)
    });

However, when I examine the information returned in the console, I cannot find the expected product information within the DOM. It appears that the initial HTML loaded does not contain the product information and that this information is loaded dynamically afterward.

How can I fetch dynamically loaded HTML using the Fetch API?



Solution

  • However, when I examine the information returned in the console, I cannot find the expected product information within the DOM.

    You haven't done anything to add it to the DOM. You've parsed it into a document, but then just logged that document out without doing anything else with it.

    You have several options. If you just want to add that HTML to the end of the page (for instance, the end of the body element), you don't have to parse the HTML explicitly. You can append it directly with document.body.insertAdjacentHTML("beforeend", text), which parses the HTML and inserts it into the document.

    Side note: Your code is falling prey to a fetch API footgun I've written about: fetch only rejects its promise on network failure, not on HTTP error statuses such as 404 or 500. Be sure to check that the HTTP call succeeded before calling response.text().

    So:

    fetch("https://example.com?page=3")
        .then((res) => {
            if (!res.ok) {
                throw new Error(`HTTP error ${res.status}`);
            }
            return res.text();
        })
        .then((text) => document.body.insertAdjacentHTML("beforeend", text))
        .catch((error) => {
            // ...handle/display error...
        });
    

    More: insertAdjacentHTML

    If you want to add it somewhere other than the end of body, just get the element you want to add it to, and use that instead of document.body.
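As a rough sketch of targeting a different element, the helper below looks the element up by selector and appends the fetched HTML to it. The name `appendHtmlTo` and the `#results` selector in the usage note are assumptions for illustration, not part of the original code.

```javascript
// Hypothetical helper: append an HTML string to the end of the element
// matching `selector`. `root` is anything with querySelector, normally
// the page's `document`.
function appendHtmlTo(root, selector, html) {
    const target = root.querySelector(selector);
    if (!target) {
        throw new Error(`No element matches ${selector}`);
    }
    // "beforeend" parses `html` and inserts it as the last children of `target`.
    target.insertAdjacentHTML("beforeend", html);
    return target;
}
```

In the fetch chain, the last step would then become something like `.then((text) => appendHtmlTo(document, "#results", text))`, assuming the page has an element with that id.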

    You'll want to make sure the HTML you send back is just content, not a full HTML page with its own head, etc.

    Alternatively, you could parse it as you have done, then use DOM methods on the returned document to find the content you need from it, and insert those elements into the main page's document via append, appendChild, insertBefore, insertAdjacentElement, etc.
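A minimal sketch of that parse-then-extract approach follows. The helper name `appendMatching`, the `.product` selector, and the `#results` container are assumptions for illustration; the real selectors depend on the markup of the page being fetched.

```javascript
// Hypothetical helper: copy every element matching `selector` from a parsed
// document into `target`, an element in the main page's document.
function appendMatching(sourceDoc, selector, target) {
    const owner = target.ownerDocument;
    const matches = Array.from(sourceDoc.querySelectorAll(selector));
    for (const el of matches) {
        // importNode makes a deep copy that belongs to the main document.
        target.append(owner.importNode(el, true));
    }
    return matches.length;
}

// Browser usage sketch (URL, selector, and container id are assumptions):
// fetch("https://example.com?page=3")
//     .then((res) => {
//         if (!res.ok) {
//             throw new Error(`HTTP error ${res.status}`);
//         }
//         return res.text();
//     })
//     .then((text) => {
//         const parsed = new DOMParser().parseFromString(text, "text/html");
//         appendMatching(parsed, ".product", document.querySelector("#results"));
//     });
```

This only helps if the elements you want are present in the HTML the server returns. Content that the page adds later with its own scripts will not be in the parsed document.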