Tags: javascript, dom, html-parsing, domparser

JavaScript: how to find the DOM node that contains a given text?


Given a fetched HTML page, I want to find the specific node that contains a given piece of text. The hard way, I guess, would be to iterate over all the nodes one by one, going as deep as the tree goes, and to run a search on each of them with e.g. .includes()

But what is the smart way? There must be something, but I haven't been able to find the right search terms for it.

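    const parser = new DOMParser();
    const response = await axios.get(url); // axios.get returns a promise (run inside an async function)
    const parsedHtml = parser.parseFromString(response.data, 'text/html');
    // Only inspects the document's direct children (just <html>), never the nested nodes
    for (let i = 0; i < parsedHtml.children.length; i++) {
        if (parsedHtml.children[i].textContent.includes('hello')) {
            console.log(parsedHtml.children[i]);
        }
    }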

* This doesn't work.

* Example HTML:

    <html>
     <body>
      <div>dfsdf</div>
      <div>
       <div>dfsdf</div>
       <div>dfsdf</div>
      </div>
      <div>
       <div>
        <div>hello</div>
       </div>
      </div>
      <div>dfsdf</div>
     </body>
    </html>

I would like to retrieve <div>hello</div> as an element node.


Solution

  • I was almost convinced that I had to traverse the DOM the classical way, but then I found this in "Javascript: How to loop through ALL DOM elements on a page?", which is indeed excellent:

        // Walk every element under parsedHtml and keep only the matching ones
        const nodeIterator = parsedHtml.createNodeIterator(
            parsedHtml,
            NodeFilter.SHOW_ELEMENT,
            (node) => {
                return (node.textContent.includes('mytext1')
                    || node.textContent.includes('mytext2'))
                    && node.nodeName.toLowerCase() !== 'script' // not interested in scripts
                    && node.children.length === 0 // innermost element: it has no child elements
                    ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_REJECT;
            }
        );
        const pars = [];
        let currentNode;

        // nextNode() returns null once every element has been visited
        while ((currentNode = nodeIterator.nextNode()))
            pars.push(currentNode);
        if (pars.length > 0)
            console.log(pars[0].textContent); // for example
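
For comparison, the "hard way" from the question (visiting the nodes one by one, as deep as they go) can be sketched roughly like this. The helper name findLeafElementsContaining is made up for illustration and is not part of any library. It assumes the same matching rule as the iterator above: only innermost elements count, and script elements are skipped. It is a sketch under those assumptions, not a definitive implementation.

```javascript
// Hypothetical helper (not from the original post or any library):
// depth-first search for the innermost elements whose text contains `text`.
// It only relies on `children`, `textContent` and `nodeName`, so it accepts
// a Document, an Element, or any object shaped like one.
function findLeafElementsContaining(root, text) {
    const matches = [];

    function visit(node) {
        // Skip <script> elements, as the NodeIterator filter does
        if ((node.nodeName || '').toLowerCase() === 'script') return;
        // Skip whole subtrees that cannot contain the text
        if (!(node.textContent || '').includes(text)) return;

        const children = Array.from(node.children || []);
        if (children.length === 0) {
            matches.push(node); // innermost element: no child elements left
            return;
        }
        for (const child of children) visit(child);
    }

    // A Document has a null textContent, so start from its root element
    visit(root.documentElement || root);
    return matches;
}

// Usage in a browser (DOMParser is not available in plain Node.js)
if (typeof DOMParser !== 'undefined') {
    const doc = new DOMParser().parseFromString(
        '<div>dfsdf</div><div><div><div>hello</div></div></div>',
        'text/html'
    );
    console.log(findLeafElementsContaining(doc, 'hello')); // array holding the inner <div>
}
```

Like the iterator version, this misses text that sits directly in an element which also has child elements (for example "hello" in <p>hello <b>world</b></p>), because only elements without child elements are accepted.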