Tags: javascript, html, if-statement, replace

How do I replace words in an HTML page using JS?


I want to replace every word on a webpage that meets a given condition with another word that depends on the word being replaced. For example, say I want to replace every word longer than 5 letters with another word that is also longer than 5 letters and starts with the same letter. Every word on the page that is longer than 5 letters will then be one of 26 possibilities (one per starting letter).

I tried to do this by taking the page's entire HTML and replacing every word that was longer than 5 letters and outside a tag, but my code changed the hyperlink and added my JS script to the HTML page. Why does this happen, and how can I accomplish what I want?

const segmenter = new Intl.Segmenter("en",{granularity:"word"});
const text = [...segmenter.segment(document.body.innerHTML)];
let test = "";
let alphabets = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".split("");
let replacements = "assignment behind controversial desperate eventually foundation growth happiness inside jurisdiction kingdom living million nightmare original president quantum revealed surprising travels understanding vacancies weather xenophobic yourself zombie Assignment Behind Controversial Desperate Eventually Foundation Growth Happiness Inside Jurisdiction Kingdom Living Million Nightmare Original President Quantum Revealed Surprising Travels Understanding Vacancies Weather Xenophobic Yourself Zombie".split(" ");

let intag = false;
for (var word of text){
if((word.segment=="<"||word.segment==">")&&!intag){
  intag=!intag
}
  test += ((word.segment.length<6)? word.segment:replacements[alphabets.indexOf(word.segment[0])])
}
document.body.innerHTML=test;
console.log(test);

<h1> Replacing words in a webpage </h1>
<a href="https://stackoverflow.com/posts/78633403">Why won't this work?</a>

Solution

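  • Your code segments document.body.innerHTML, which contains tag names, attribute values such as the href and, in the snippet, the script element itself. The intag flag is set but never consulted when choosing a replacement, so markup is rewritten along with the visible text (script is 6 letters long, so even that tag name gets replaced). Instead, grab all the text nodes with createTreeWalker and then replace them with replaceWith.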

    function get_text_nodes(el) {
      const nodes = [];
      const rejected_tags = new Set(["SCRIPT", "STYLE"]);
      // Accept only non-empty text nodes that are not inside <script> or <style>.
      const walker = document.createTreeWalker(el, NodeFilter.SHOW_TEXT, node =>
        rejected_tags.has(node.parentNode.tagName) || !node.textContent
          ? NodeFilter.FILTER_REJECT
          : NodeFilter.FILTER_ACCEPT);
      while(walker.nextNode()) {
        nodes.push(walker.currentNode);
      }
      return nodes;
    }
    
    const segmenter = new Intl.Segmenter("en",{granularity:"word"});
    const replacements = Object.fromEntries("assignment behind controversial desperate eventually foundation growth happiness inside jurisdiction kingdom living million nightmare original president quantum revealed surprising travels understanding vacancies weather xenophobic yourself zombie Assignment Behind Controversial Desperate Eventually Foundation Growth Happiness Inside Jurisdiction Kingdom Living Million Nightmare Original President Quantum Revealed Surprising Travels Understanding Vacancies Weather Xenophobic Yourself Zombie".split(" ").map(word => [word[0], word]));
    
    for (const node of get_text_nodes(document.body)) {
      const words = [...segmenter.segment(node.textContent)];
      const text = words.map(word => word.segment.length<6 ? word.segment : replacements[word.segment[0]] ?? word.segment).join("");
      node.replaceWith(text);
    }

    <h1> Replacing words in a webpage </h1>
    <a href="https://stackoverflow.com/posts/78633403">Why won't this work?</a>
    <div>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</div>
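
For contrast with the text-node approach, here is a minimal sketch of what the segmenter sees when it is handed raw markup instead of a text node. The sample HTML string and the helper name longWords are made up for illustration; the string only approximates what document.body.innerHTML might contain in the snippet.

```javascript
const segmenter = new Intl.Segmenter("en", { granularity: "word" });

// Hypothetical helper: list the word-like segments longer than 5 characters,
// i.e. the segments the question's condition would replace.
function longWords(text) {
  return [...segmenter.segment(text)]
    .filter(s => s.isWordLike && s.segment.length > 5)
    .map(s => s.segment);
}

// Assumed sample of serialized markup: a link plus an inline script.
const asMarkup =
  '<a href="https://stackoverflow.com/posts/78633403">Why won\'t this work?</a>' +
  '<script>let test = "";</script>';

// The content of the link's text node, which is all a tree walker would hand over.
const asTextNode = "Why won't this work?";

// Tag names and pieces of the URL show up as ordinary "words" here.
console.log(longWords(asMarkup));

// Only visible text is considered here, and none of it is long enough to replace.
console.log(longWords(asTextNode));
```

Because the segmenter has no notion of HTML, anything long enough inside a tag or attribute is a candidate for replacement when the whole innerHTML string is processed, which is why working on text nodes is the safer route.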

    Also, alphabets is unnecessary. Just create a lookup object from your replacement string, keyed by each word's first letter.
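
The lookup idea can be tried on a plain string, without any DOM. This is only a sketch: the replacement list is shortened to three letters for illustration, and the names lookup and replaceLongWords are hypothetical. It also checks isWordLike, which the answer's code does not do, so that whitespace and punctuation segments are never considered for replacement.

```javascript
// Shortened, illustrative replacement list (the real one covers every letter in both cases).
const replacementWords = "assignment behind controversial Assignment Behind Controversial".split(" ");

// Lookup keyed by first character, e.g. { a: "assignment", A: "Assignment", ... }.
const lookup = Object.fromEntries(replacementWords.map(word => [word[0], word]));

const segmenter = new Intl.Segmenter("en", { granularity: "word" });

// Replace each word-like segment longer than 5 characters with the lookup entry
// for its first character; leave everything else (and unknown letters) untouched.
function replaceLongWords(text) {
  let out = "";
  for (const { segment, isWordLike } of segmenter.segment(text)) {
    const replacement = isWordLike && segment.length > 5 ? lookup[segment[0]] : undefined;
    out += replacement ?? segment;
  }
  return out;
}

console.log(replaceLongWords("Always bring a big camera"));
// → "Assignment bring a big controversial"
```

In the page version, the same function could be applied to each text node's textContent inside the loop over get_text_nodes(document.body).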