
How can I get an array of the #text nodes of an HTML tree?


I need all the #text nodes in an HTML body as an array. The rich text can be nested to any depth, so I need to reach the deepest nodes. For example, for the text below I expect an array of 8 elements.


What is the name, tag, or method for getting the #text nodes?


Solution

  • You can recursively walk the nodes and push the trimmed text of each non-empty text node into an array. Text nodes are the ones whose nodeName is "#text".

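    // Collects the trimmed, non-empty text of every text node under the root.
    const textNodes = [];
    
    function pushTextNode(node) {
      // Text nodes report nodeName "#text" (nodeType === Node.TEXT_NODE).
      if (node.nodeName === "#text") {
        const nodeVal = node.nodeValue.trim();
        // Skip whitespace-only nodes created by indentation and line breaks.
        if (nodeVal) {
          textNodes.push(nodeVal);
        }
        return;
      }
      // Otherwise descend into the children, in document order.
      node.childNodes.forEach((childNode) => {
        pushTextNode(childNode);
      });
    }
    
    pushTextNode(document.querySelector("#root"));
    console.log(textNodes); // ["0", "12", "3", "4", "5", "67", "8", "9"]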
    
    <!-- Markup the script above runs against -->
    <div id="root">
      <span>
        0
        <b>
          12<u>3</u>
        </b>
        <u>
          4<b>5</b>
        </u>
        <b>67</b>8<a href="#">9</a>
      </span>
    </div>
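
If you need the text nodes themselves rather than their string values (for example, to edit them in place), a non-recursive variant along these lines might work. This is only a sketch: `collectTextNodes` and `TEXT_NODE` are names made up for illustration, and the check uses `nodeType === 3` (the numeric value of `Node.TEXT_NODE`) instead of comparing `nodeName`. Browsers also provide `document.createTreeWalker(root, NodeFilter.SHOW_TEXT)`, which can replace the manual traversal.

```javascript
// Numeric value of Node.TEXT_NODE; hard-coded so the sketch also runs outside a browser.
const TEXT_NODE = 3;

// Hypothetical helper: returns the non-empty text nodes under `root`, in document order.
function collectTextNodes(root) {
  const result = [];
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.nodeType === TEXT_NODE) {
      // Keep only nodes that contain something other than whitespace.
      if (node.nodeValue.trim() !== "") {
        result.push(node);
      }
      continue;
    }
    // Push children in reverse so they are popped in document order.
    const children = Array.from(node.childNodes || []);
    for (let i = children.length - 1; i >= 0; i--) {
      stack.push(children[i]);
    }
  }
  return result;
}
```

In a browser, `collectTextNodes(document.querySelector("#root")).map((n) => n.nodeValue.trim())` should give the same eight strings for the markup above, while the un-mapped result keeps references to the live nodes. The explicit stack also avoids deep recursion on heavily nested markup.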