javascript html events iframe browser-cache

How to read the *original* source of an iframe, after it has loaded, from the cache


Put generally: I want to fire exactly one network request, only once the client has scrolled a certain part of the page into view, and then both use the response and display it in an iframe, as efficiently as possible.

Given a DOM structured like so:

<!DOCTYPE html>
<html>
  <head />
  <body>
    ...
    <iframe loading="lazy" sandbox="" src="http://www.example.com"></iframe>
    <pre></pre>
    ...
  </body>
</html>

I want to show the client, in the pre tag, what the source of the iframe above looks like. The iframe may host an arbitrary document, textual or binary; all that is known is that the browser can display it. The iframe's source URL is on the same origin.

I am aiming to display what one would see by going to a Chromium "view-source:" URL, or similar.

Accessing the .contentWindow or .contentDocument properties may not be possible, as the iframe is fully sandboxed. Even if I could, the document's outerHTML would not be sufficient, and using an XMLSerializer would obviously change the output. I also believe that browsers are allowed to alter parts of a document when parsing it, such as unnecessary whitespace or formatting.

I had simply tried the following:

document
.body
.querySelector("iframe")
.addEventListener(
    "load",
    async ( { currentTarget: { src } } ) => {
        const data = await fetch(
            src, {
                cache: "only-if-cached"
            }
        );

        // ... use data
    }, {
        passive: true,
        once: true
    }
);

Yet the fetch failed. It seemed that the URL was not in the browser's cache, but I did not want to initiate a new network request. Is there an efficient way to do this?

I was thinking of using an Intersection Observer as a potential solution, because it would result in only one network request, but the code got fairly long and did not seem to work correctly (I am inexperienced with the observer).


Solution

  • There are a few potential things going on here with the event listener and the fetch request. Note to other readers that lazy loading in iframes is only supported in Chrome/Chromium, Edge, and Opera as of this writing (November, 2020).

    This answer is aimed at the questioner, who also wants to use lazy loading to load content dynamically. If the only concern is making a single network request, that could be done by fetching the document yourself and then feeding it to the iframe through a Blob URL, a data URI, or the srcdoc attribute, or by using Service Workers (explored a little bit below).
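
As a rough illustration of that single-request alternative, the sketch below fetches the document once, shows its text in the pre element, and hands the same bytes to the iframe through a Blob URL. The names loadOnce and loadWhenVisible are made up for this example, and it assumes the iframe starts without a src attribute. Two caveats: relative URLs inside the framed document will likely no longer resolve once it is served from a blob: URL, and blob.text() decodes as UTF-8, so binary content will not be shown faithfully.

```javascript
// Hypothetical helper: one request feeds both the <pre> and the <iframe>.
// fetchImpl is injectable only so the function is easy to exercise in isolation.
async function loadOnce(frame, pre, url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  const blob = await response.blob();      // keeps the response's MIME type
  pre.textContent = await blob.text();     // decoded as UTF-8
  frame.src = URL.createObjectURL(blob);   // the iframe renders the same bytes
}

// Hypothetical wiring: start the request only once the iframe scrolls into view.
function loadWhenVisible(frame, pre, url) {
  const observer = new IntersectionObserver((entries) => {
    if (entries.some((entry) => entry.isIntersecting)) {
      observer.disconnect();               // fire at most once
      loadOnce(frame, pre, url).catch(console.error);
    }
  });
  observer.observe(frame);
}
```

In a page this might be called as loadWhenVisible(document.querySelector("iframe"), document.querySelector("pre"), "frame.html"). This replaces loading="lazy" rather than combining with it.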

    Frames Potentially Loading Before Adding Event Listeners

    You add an event listener to your frame after it's declared, which can create problems if the frame loads before the script is evaluated (which may be the case if it is cached).

    Normally, you could check whether your frame has already loaded via the contentWindow or contentDocument property and, if so, run your initialization code on it. Because the frame is sandboxed, though, those properties aren't accessible. What you can do instead is declare your handler ahead of time and attach it inline, through the onload attribute, when you declare the iframe:

    <script>
      // Declared before the iframe is parsed, so it already exists when load fires.
      async function loader(frame) {
        console.log(frame.src);
        // ...
      }
    </script>

    <iframe src="frame.html" loading="lazy" sandbox="" onload="loader(this)"></iframe>
    

    Using Same-Origin in fetch requests

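    According to MDN, only-if-cached can only be used if the mode is set to same-origin. With any other mode (fetch defaults to cors) the call is rejected with a TypeError before the cache is even consulted, which may well be the failure seen in the question. The restriction should be okay in your situation, since you said that the frame is on the same origin. The fetch request would look something like: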

    await fetch(src, { mode: 'same-origin', cache: 'only-if-cached' });
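
MDN describes a cache miss with only-if-cached as a 504 response, while the Fetch specification describes it as a network error, which would surface as a rejected promise. A defensive sketch might therefore treat both as a miss and fall back to a normal request. fetchPreferCache and its injectable fetchImpl parameter are names invented for this example, not part of any API.

```javascript
// Hypothetical helper: try the HTTP cache first, fall back to the network.
async function fetchPreferCache(url, fetchImpl = fetch) {
  try {
    const cached = await fetchImpl(url, {
      mode: 'same-origin',       // required for only-if-cached
      cache: 'only-if-cached'
    });
    // A 504 here signals "not in the cache" rather than a real gateway error.
    if (cached.status !== 504) return cached;
  } catch (err) {
    // Some browsers may reject instead of answering 504; treat it as a miss.
  }
  // Cache miss: issue an ordinary request (this one does hit the network).
  return fetchImpl(url, { mode: 'same-origin' });
}
```

Inside the load handler this would be used as `const response = await fetchPreferCache(frame.src);`. If the fallback request also fails, its rejection propagates to the caller as usual.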
    

    The Result

    main.html

    <!doctype html>
    <html>
      <head>
        <meta charset="utf-8" />
        <style>
          .spacer {
            height: 200vh;
            background: lightgray;
            text-align: center;
          }
        </style>
        <script>
          async function loader(frame) {
            var sourceCodeElementId = frame.dataset.code;
            var sourceCodeElement = document.getElementById(sourceCodeElementId);
            var response = await fetch(frame.src, {
              mode: 'same-origin',
              cache: 'only-if-cached'
            });
            if (response.status === 504) {
              // if it didn't load via cache, just get it normally
              response = await fetch(frame.src);
            }
            var text = await response.text();
            sourceCodeElement.textContent = text;
          }
        </script>
      </head>
      <body>
        <div class="spacer">Scroll Down</div>
        <iframe src="frame.html"
            loading="lazy"
            data-code="frame-source-code"
            sandbox=""
            onload="loader(this)">
        </iframe>
        <pre id="frame-source-code"></pre>
      </body>
    </html>
    

    Sample frame.html

    <!doctype html>
    <html>
    Frame
    </html>
    

    Optionally/Alternatively Use Service Workers

    If the concern is the number of network requests, you can use Service Workers instead of, or in addition to, the fixes above. They take a bit of work to get up and running, but they give you finer control over network requests. Caching files is a common tutorial topic, so you should be able to find plenty of resources if you go that route.
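
For reference, a cache-first worker could look roughly like the sketch below. The file name sw.js, the cache name, and the cacheFirst helper are assumptions made for illustration. Whether a fully sandboxed frame's own navigation is routed through the worker is something to check in your target browsers, so this mainly helps the page's own fetch call.

```javascript
// sw.js (hypothetical file name)
const CACHE_NAME = 'frame-source-v1'; // hypothetical cache name

// Return the cached response if there is one; otherwise fetch and store a copy.
async function cacheFirst(request, cache, fetchImpl) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchImpl(request);
  if (response.status === 200) {
    // Store a clone, because a response body can only be read once.
    await cache.put(request, response.clone());
  }
  return response;
}

// Only wire up the listener when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') return; // let other methods pass through
    event.respondWith(
      caches.open(CACHE_NAME).then((cache) => cacheFirst(event.request, cache, fetch))
    );
  });
}
```

The page would register it once with navigator.serviceWorker.register("/sw.js"). The worker only controls pages loaded after it has activated, so the very first visit still goes to the network.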