php DOMDocument: element ending up within another


I have some HTML that contains (among other things) p elements and figure elements, where each figure contains a single img element.
For the sake of simplicity I'll define an example of what can be found in the HTML here in a PHP variable:

$content = '<figure class="image image-style-align-left">
<img src="https://placekitten.com/g/200/300"></figure>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>';

I load $content into a DOMDocument and, in this example, change the src attribute of every img element inside a figure element:

$dom = new DOMDocument();
libxml_use_internal_errors(true);

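// Encode non-ASCII characters as HTML entities first; otherwise loadHTML() mangles them.
// Note: the 'HTML-ENTITIES' target of mb_convert_encoding() is deprecated as of PHP 8.2.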
$domPart = mb_convert_encoding($content, 'HTML-ENTITIES', "UTF-8");
$dom->loadHTML($domPart, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$domFigures = $dom->getElementsByTagName('figure');

foreach ($domFigures as $domFigure) {

    $img = $domFigure->getElementsByTagName('img')[0];
    if ($img) {
        $img->setAttribute('src', "https://placekitten.com/g/400/500");
    }

}

$result = $dom->saveHTML();

The result is:

<figure class="image image-style-align-left">
<img src="https://placekitten.com/g/400/500">
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>
</figure>

Somehow my p-element has moved into my figure-element. Why does this happen and what can I do to prevent it?



Solution

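  • A DOMDocument must have a single root element. Because LIBXML_HTML_NOIMPLIED stops libxml from adding the implied html and body wrappers, the first top-level element (here the figure) becomes the root, and every following sibling is moved inside it.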

    The simplest fix is to wrap your content in a single container element, e.g.:

    $content = '<div><figure class="image image-style-align-left">
    <img src="https://placekitten.com/g/200/300"></figure>
    <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p></div>';
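
If the wrapper should not show up in the final markup, one possible approach is to add it only while parsing and then serialize the wrapper's children one by one. The following is a minimal sketch under a few assumptions: replaceFigureImageSrc() is a hypothetical helper name, the wrapper is a plain div, and the encoding step from the question is left out for brevity (keep it, or an equivalent, if the input contains non-ASCII characters).

```php
<?php
// Sketch only: replaceFigureImageSrc() is a hypothetical helper, not part of the answer above.
function replaceFigureImageSrc(string $content, string $newSrc): string
{
    // Collect libxml warnings internally (e.g. for HTML5 tags such as <figure>)
    // and remember the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // The temporary <div> gives the fragment a single root element,
    // so <figure> and <p> stay siblings instead of being nested.
    $dom->loadHTML(
        '<div>' . $content . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Same change as in the question: update the first <img> in each <figure>.
    foreach ($dom->getElementsByTagName('figure') as $figure) {
        $img = $figure->getElementsByTagName('img')->item(0);
        if ($img !== null) {
            $img->setAttribute('src', $newSrc);
        }
    }

    // Serialize only the wrapper's children so the <div> itself is not emitted.
    $html = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }

    return $html;
}
```

Calling replaceFigureImageSrc($content, 'https://placekitten.com/g/400/500') with the $content from the question should return the figure and the p as siblings again, without the surrounding div.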