Tags: php, html, dom, domdocument

PHP DOM function adds extra <p>, <html>, and <body> tags


I use the following function to show different images to desktop and mobile users, depending on their device.

My index.php file

<!DOCTYPE html>
<html class="no-js" lang="en">
<head>
    <meta charset="utf-8">
    <meta http-equiv="x-ua-compatible" content="ie=edge">
    <title>Testing Page</title>
</head>
<body>
<?php 
define("DEVICE", "desktop");
ob_start(); 
?>
<?php echo 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.' . '<br/>'?> 
<?php echo 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.' . '<br/>'?> 
<?php echo 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.' . '<br/>'?> 
<?php echo 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.' . '<br/>'?> 
<?php echo 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.' . '<br/>'?> 
<?php echo 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.' . '<br/>'?> 
<div>
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
<img src="/desktop-img/blog-1.png" alt="blog-1">
<img src="/desktop-img/blog-2.png" alt="blog-2">
<img src="/desktop-img/blog-3.png" alt="blog-3">
</div>
<?php
// Assign buffered content to a variable for further processing
$content = ob_get_clean();

// Device specific images
function selectPaths($tag){

    // If the path is wrapped in <pre> or <code> tags, leave it untouched
    if($tag->nodeName=="pre" || $tag->nodeName=="code"){
        return;
    // If not wrapped within <pre> or <code> tags
    } elseif($tag->nodeName=="img"){
        // Replace device specific path
        $tag->attributes->getNamedItem("src")->nodeValue=str_replace('desktop-img', DEVICE . '-img',$tag->attributes->getNamedItem("src")->nodeValue);
    } elseif($tag->hasChildNodes()){
        foreach($tag->childNodes as $child){
            selectPaths($child);
        }
    }
}

function deviceImages($content){

    $dom=new DOMDocument;
    $dom->preserveWhiteSpace=true;
    libxml_use_internal_errors(true);
    $dom->loadHTML($content);
    libxml_clear_errors();
    $root=$dom->documentElement;
    selectPaths($root);
    $dom->formatOutput=false;
    //Assign to variable
    $content = $dom->saveHTML($root);
    return $content;
}
$content = deviceImages ($content);
?>
<div id='wrapper'>
    <?php echo $content; ?>
</div>
</body>
</html>

My challenge:

This function adds a <p> tag and extra <html> and <body> tags to my output.


My output source code

<!DOCTYPE html>
<html class="no-js" lang="en">
<head>
    <meta charset="utf-8">
    <meta http-equiv="x-ua-compatible" content="ie=edge">
    <title>Testing Page</title>
</head>
<body>
<div id='wrapper'>
    <html><body>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.<br> 
Lorem ipsum dolor sit amet, consectetur adipiscing elit.<br> 
Lorem ipsum dolor sit amet, consectetur adipiscing elit.<br> 
Lorem ipsum dolor sit amet, consectetur adipiscing elit.<br> 
Lorem ipsum dolor sit amet, consectetur adipiscing elit.<br> 
Lorem ipsum dolor sit amet, consectetur adipiscing elit.<br></p>
<div>
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
<img src="/desktop-img/blog-1.png" alt="blog-1"><img src="/desktop-img/blog-2.png" alt="blog-2"><img src="/desktop-img/blog-3.png" alt="blog-3">
</div>
</body></html></div>
</body>
</html>


My question:

How can I avoid these <p>, <html>, and <body> tags?

UPDATED

Updated as per the suggestion of @Aknosis about the <br/> tags.


Solution

  • The content you output is generated via DOMDocument's saveHTML method:

    $content = $dom->saveHTML($root);
    

    You reference the root node here, the documentElement, which is the <html> element you do not want to output (it is also the parent of the unwanted <body>). So choose the correct nodes to output, e.g. the children of that document's <body>.

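    // Use <body> as the container instead of the document element
    $body = $dom->getElementsByTagName('body')->item(0);

    // Serialize each child of <body> and join the pieces into one string
    $content = implode(
        "",
        array_map([$dom, 'saveHTML'], iterator_to_array($body->childNodes))
    );

    echo $content;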
    

    In your case, I think you would take the first <p> element instead of the <body> element.

    For some related cases a different approach might be necessary; there is also additional Q&A material on this site for that topic:
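
To make the "serialize the children of a chosen container, not the document element" idea concrete, here is a minimal self-contained sketch. It wraps the fragment in a single <div> before parsing and writes back only that wrapper's children. The wrapper id `device-images-root`, the sample markup, and the `mobile` value for DEVICE are assumptions made for illustration, and the image lookup uses `getElementsByTagName` in place of the question's recursive walk, which is a swapped-in simplification rather than the original method.

```php
<?php
// Hypothetical value for illustration; the question defines DEVICE as "desktop".
const DEVICE = 'mobile';

/**
 * Rewrites desktop image paths in an HTML fragment and returns the fragment
 * without the <html>/<body> wrapper that loadHTML() adds around it.
 */
function deviceImages(string $content): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    // Assumption: wrapping the fragment in one <div> gives the parser a single
    // container, so loose leading text is not moved into an implied <p>.
    $dom->loadHTML('<div id="device-images-root">' . $content . '</div>');
    libxml_clear_errors();

    // Rewrite the src of every <img> that is not inside <pre> or <code>.
    foreach ($dom->getElementsByTagName('img') as $img) {
        for ($node = $img->parentNode; $node !== null; $node = $node->parentNode) {
            if ($node->nodeName === 'pre' || $node->nodeName === 'code') {
                continue 2; // image is part of a code sample, leave it alone
            }
        }
        $src = $img->getAttribute('src');
        $img->setAttribute('src', str_replace('desktop-img', DEVICE . '-img', $src));
    }

    // The wrapper <div> is the first child of the implied <body>.
    $wrapper = $dom->getElementsByTagName('body')->item(0)->firstChild;

    // Serialize only the wrapper's children: no <html>, <body>, or wrapper tag.
    $html = '';
    foreach ($wrapper->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }
    return $html;
}

$content = "Lorem ipsum dolor sit amet.<br>\n"
    . '<div><img src="/desktop-img/blog-1.png" alt="blog-1"></div>';
echo deviceImages($content);
```

The wrapper goes beyond what the answer proposes: libxml's HTML parser tends to place loose text at the top of <body> into a <p>, while text inside a <div> is left alone. If the fragment contains an unbalanced </div>, the wrapper closes early and this sketch would lose the content after it, so it is only suitable for well-formed fragments.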