javascript iframe text innerhtml pre

How to stop the browser from encoding <, >, and & when reading a TXT file loaded in an iframe on a Web page?


I have noticed that browsers encode the left angle bracket <, right angle bracket >, and ampersand & as their respective entities (&lt;/&gt;/&amp;) when I read the innerHTML of an iframe whose source is a TXT file. How do I stop this from happening? The innerHTML is also wrapped in a <pre> tag.

For example, suppose I have the following inside a TXT (not HTML) file:

<div>
    Hello world! I love M&M's candy.
</div>

The following iframe:

<iframe id="MyIframe" src="/hello.txt"></iframe>

And the following JavaScript:

var MyIframe = document.getElementById('MyIframe');
alert(MyIframe.contentWindow.document.body.innerHTML);

The alert dialog box will pop up with the following:

<pre>
    &lt;div&gt;Hello world! I love M&amp;M's candy.&lt;/div&gt;
</pre>

How do I stop JavaScript from doing this with the content of the TXT file? I just want the raw, un-encoded content of the file.

I cannot use XMLHttpRequest.


Solution

  • Instead of reading .innerHTML, read .textContent. innerHTML serializes the DOM back into HTML markup, which is why <, >, and & come out as &lt;, &gt;, and &amp;. textContent returns the text of the element tree as-is, with no escaping and no tags such as the <pre> wrapper. You can try it out on this page by inspecting one of the HTML samples in your question and entering $0.textContent in the developer console.
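
Applied to the iframe from the question, the idea might look like the sketch below. The helper name readIframeText is hypothetical (it is not part of the original answer), and the sketch assumes the TXT file is served from the same origin as the page and that the iframe has finished loading before it is read.

```javascript
// Hypothetical helper: returns the raw text of a same-origin iframe's document.
function readIframeText(iframe) {
  // contentDocument is the iframe's document; fall back to contentWindow.document.
  var doc = iframe.contentDocument || iframe.contentWindow.document;
  // textContent returns the text nodes as-is, so "<", ">" and "&" are not
  // turned into &lt;, &gt; and &amp; the way innerHTML serializes them.
  return doc.body.textContent;
}

// Browser usage, assuming the iframe with id "MyIframe" from the question.
// The guard only keeps the snippet harmless outside a browser.
if (typeof document !== 'undefined') {
  var MyIframe = document.getElementById('MyIframe');
  if (MyIframe) {
    // Wait for the load event so the TXT content is actually there.
    MyIframe.addEventListener('load', function () {
      alert(readIframeText(MyIframe));
    });
  }
}
```

If the iframe may already be loaded by the time this script runs, the load listener will not fire again, so in that case call readIframeText(MyIframe) directly.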