I'm new here, so I'll try to explain my problem.
I'm developing a Chrome extension that manipulates the DOM.
I need to split the text inside each <p> element into single words so that I can apply CSS to each word afterwards, while preserving any other elements (<a>, <em>, <strong>, etc.) that may be inside the <p>.
Example of possible text in a web page:
<p>
Sed ut <a> perspiciatis unde omnis </a>
iste natus <em> error sit </em>
voluptatem <strong> accusantium </strong>
doloremque laudantium
</p>
Using jQuery, I thought of wrapping each word in a <span> tag with a class attribute that I can target with CSS.
I found this code, which splits the words inside <p> correctly but ignores any other elements inside the <p>.
The code I used (which doesn't do what I need):
$("p").each(function() {
    var originalText = $(this).text().split(' ');
    var spannedText = [];
    for (var i = 0; i < originalText.length; i += 1) {
        if(originalText[i] != ""){
            spannedText[i] = ('<span class="...">' + originalText.slice(i,i+1).join(' ') + '</span>');
        }
    }
    $(this).html(spannedText.join(' '));
});
For the example above, this code generates the following output, removing the other elements:
<p>
<span>Sed</span>
<span>ut</span>
<span>perspiciatis</span>
<span>unde</span>
<span>omnis</span>
<span>iste</span>
<span>natus</span>
<span>error</span>
<span>sit</span>
<span>voluptatem</span>
<span>accusantium</span>
<span>doloremque</span>
<span>laudantium</span>
</p>
This is close to the solution I need, but all the tags present in the example (<a>, <em>, <strong>) are removed, leaving only the <span> tags.
Instead, I want to keep the HTML structure of the <p> and only insert <span>...</span> around each word.
This is the output I would like to achieve:
<p>
<span>Sed</span>
<span>ut</span>
<a> <span>perspiciatis</span> <span>unde</span> <span>omnis</span> </a>
<span>iste</span>
<span>natus</span>
<em> <span>error</span> <span>sit</span> </em>
<span>voluptatem</span>
<strong> <span>accusantium</span> </strong>
<span>doloremque</span>
<span>laudantium</span>
</p>
Can you help me?
Replacing the HTML destroys all event listeners that were added to the child elements in JavaScript, and it makes the browser re-parse the entire content, which is CPU-intensive and can be slow on slower devices. Don't do this.
Instead, process only the text nodes using DOM methods:
// template <span> that is cloned for every word
const span = document.createElement('span');
span.className = 'foo';
span.appendChild(document.createTextNode(''));
// these will display <span> as a literal text per HTML specification
const skipTags = ['textarea', 'rp'];
for (const p of document.getElementsByTagName('p')) {
  const walker = document.createTreeWalker(p, NodeFilter.SHOW_TEXT);
  // collect the nodes first because we can't insert new span nodes while walking
  const textNodes = [];
  for (let n; (n = walker.nextNode());) {
    if (n.nodeValue.trim() && !skipTags.includes(n.parentNode.localName)) {
      textNodes.push(n);
    }
  }
  for (const n of textNodes) {
    const fragment = document.createDocumentFragment();
    // the capturing group keeps the whitespace runs in the split result
    for (const s of n.nodeValue.split(/(\s+)/)) {
      // split() yields '' at the edges when the text starts or ends with whitespace
      if (!s) continue;
      if (s.trim()) {
        // a word: wrap it in a copy of the template span
        span.firstChild.nodeValue = s;
        fragment.appendChild(span.cloneNode(true));
      } else {
        // whitespace: keep it as a plain text node
        fragment.appendChild(document.createTextNode(s));
      }
    }
    n.parentNode.replaceChild(fragment, n);
  }
}
Since we may be replacing thousands of nodes, this code tries to be fast: it uses the TreeWalker API, clones a template <span> node instead of building each one from scratch, passes potentially very long runs of spaces and line breaks through as single text nodes via the simple regular expression \s+, and uses a DocumentFragment to insert the new nodes in a single mutation operation. And of course it doesn't use jQuery.
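To see why the split uses a capturing group, here is a minimal sketch of just the string part, with no DOM involved. The `tokenize` helper and its `isWord` field are hypothetical names made up for this illustration; they are not part of the code above. Because `\s+` is wrapped in parentheses, `split` returns the whitespace runs along with the words, so the original spacing can be put back untouched.

```javascript
// Hypothetical helper for illustration only: splits text into word and
// whitespace tokens the same way the loop above does.
function tokenize(text) {
  return text
    .split(/(\s+)/)            // capturing group keeps the separators
    .filter(s => s !== '')     // drop empty strings produced at the edges
    .map(s => ({ text: s, isWord: s.trim() !== '' }));
}

const tokens = tokenize(' iste natus\n');
const words = tokens.filter(t => t.isWord).map(t => t.text);
console.log(words); // [ 'iste', 'natus' ]

// Joining every token reproduces the input exactly, whitespace included.
console.log(tokens.map(t => t.text).join('') === ' iste natus\n'); // true
```

In the DOM version, each token with `isWord` set would become a cloned span and every other token would stay a plain text node.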
P.S. There are advanced libraries, such as mark.js, for much more complex matching and processing.