javascript, dom, parent-node, child-nodes

How to remove all text from a web page with JavaScript


I want a JavaScript function that removes all text from a web page. The background: to compare how the rendered DOM looks in different browsers, I first need to eliminate the obvious differences. Since font rendering is a known difference, I want to remove all text. The solutions I found always looked like this:

if(start.nodeType === Node.TEXT_NODE) 
{
    start.parentNode.removeChild(start);
}

But this only removes pure text nodes. I also want to find constructs like:

 <div>
        <p>
             <em>28.11.2014</em>
             <img></img>
                Testtext
             <span>
                <i>Testtext</i>
                Testtext
             </span>
        </p>
  </div>

Here the element containing text also contains child elements (such as <img> or <span> in the example above), so the element itself is not recognized as a text node.

So I basically want to turn the above DOM into this:

 <div>
        <p>
             <em></em>
             <img></img>
             <span>
                <i></i>
             </span>
        </p>
  </div>

Solution

  • You can try something like this.

    HTML:

    <div id="startFrom">
        <p>
            <em>28.11.2014</em>
                <img></img>
                Testtext
            <span>
                <i>Testtext</i>
                Testtext
            </span>
        </p>
    </div>  
    

    JavaScript:

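    var startFrom = document.getElementById("startFrom");
    
    // Recursively walk every descendant of `node` and blank out all text nodes.
    function traverseDom(node) {
        var child = node.firstChild;
        while (child) {
            if (child.nodeType === Node.TEXT_NODE) {
                // Empty the text but keep the node, so nextSibling still works while iterating.
                child.data = "";
            } else {
                // Elements may contain further text nodes at any depth.
                traverseDom(child);
            }
            child = child.nextSibling;
        }
    }
    
    traverseDom(startFrom);
    console.log(startFrom);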
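
    The code above empties the text nodes but leaves them in the tree. If the nodes should actually disappear, as in the target markup from the question, a variant along the following lines might work. This is only a sketch: `removeTextNodes` and the `TEXT_NODE` constant are names made up for this example, and it assumes the same `startFrom` element as above. Note that it also removes the whitespace-only text nodes that sit between tags.

```javascript
// nodeType value of text nodes (the same value as Node.TEXT_NODE in browsers).
const TEXT_NODE = 3;

// Hypothetical helper: removes every text node below `root`
// and returns how many nodes were removed.
function removeTextNodes(root) {
    let removed = 0;
    let child = root.firstChild;
    while (child) {
        // Remember the next sibling first: removing `child` detaches it from the tree.
        const next = child.nextSibling;
        if (child.nodeType === TEXT_NODE) {
            root.removeChild(child);
            removed += 1;
        } else {
            // Descend into elements, which may contain text at any depth.
            removed += removeTextNodes(child);
        }
        child = next;
    }
    return removed;
}

// Browser usage, assuming the "startFrom" element from the HTML above exists.
if (typeof document !== "undefined") {
    const startFrom = document.getElementById("startFrom");
    if (startFrom) {
        console.log(removeTextNodes(startFrom) + " text nodes removed");
    }
}
```

    Storing `nextSibling` before calling `removeChild` is the important detail here; once a node is removed, its `nextSibling` is `null` and the loop would stop early.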