I have this kind of HTML document.
<span class="class1">text1</span>
<a href="">link1</a>
<font color=""><b>text2</b></font>
<a href="">link2</a>
text3
<span class="class2">text4</span>
And I'd like to surround text1, text2 and text3 with &nbsp;s. What would be the best way? DOMDocument cannot catch strings that are not wrapped in a tag. For text1 and text2, getElementsByTagName('tagname')->item(0) can be used, but for text3 I'm not sure what to do.
Any ideas?
[Edit]
As Musa suggests, I tried using nextSibling.
<?php
$html = <<<STR
<span class="class1">text1</span>
<a href="">link1</a>
<font color=""><b>text2</b></font>
<a href="">link2</a>
text3
<span class="class2">text4</span>
STR;
$doc = new DOMDocument;
$doc->loadHTML($html);
foreach ($doc->getElementsByTagName('a') as $nodeA) {
    // nextSibling is whatever node follows the link, which may be a text node
    $nodeA->nextSibling->nodeValue = '&nbsp;' . $nodeA->nextSibling->nodeValue . '&nbsp;';
}
echo $doc->saveHTML();
?>
However, &nbsp; gets escaped and converted to &amp;nbsp;
Since setting nodeValue seems to set the content as text rather than HTML, you could use the non-breaking space character itself instead of the HTML entity.
<?php
$html = <<<STR
<span class="class1">text1</span>
<a href="">link1</a>
<font color=""><b>text2</b></font>
<a href="">link2</a>
text3
<span class="class2">text4</span>
STR;
$nbsp = "\xc2\xa0"; // UTF-8 bytes of U+00A0 (non-breaking space)
$doc = new DOMDocument;
// Wrap the fragment in a div so its top-level nodes share one parent
$doc->loadHTML('<div>' . $html . '</div>');
foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $node) {
    if ($node->nodeType === XML_TEXT_NODE) { // XML_TEXT_NODE === 3
        $node->nodeValue = $nbsp . $node->nodeValue . $nbsp;
    }
}
echo $doc->saveHTML();
?>
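One thing to watch with the loop above: the line breaks between the elements are text nodes too, so they may also get padded. A possible variation, offered only as a sketch, selects just the text nodes that contain non-whitespace characters by using DOMXPath. The id "wrapper" and the trim() call are my own additions for illustration and are not part of the original code.

```php
<?php
$html = <<<STR
<span class="class1">text1</span>
<a href="">link1</a>
<font color=""><b>text2</b></font>
<a href="">link2</a>
text3
<span class="class2">text4</span>
STR;

$nbsp = "\xc2\xa0"; // UTF-8 bytes of U+00A0 (non-breaking space)

$doc = new DOMDocument;
// "wrapper" is a hypothetical id, used only so the XPath query can find the div
$doc->loadHTML('<div id="wrapper">' . $html . '</div>');

$xpath = new DOMXPath($doc);
// text()[normalize-space()] matches only direct text children that are not
// empty after whitespace normalization, so bare line breaks are skipped
$query = '//div[@id="wrapper"]/text()[normalize-space()]';
foreach ($xpath->query($query) as $node) {
    // trim() drops the surrounding line breaks before padding (an assumption
    // about the desired output; remove it to keep the original whitespace)
    $node->nodeValue = $nbsp . trim($node->nodeValue) . $nbsp;
}
echo $doc->saveHTML();
```

This only handles untagged text such as text3. For text1 and text2 the same idea should work by querying the text nodes inside the span and b elements instead.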