Tags: php, html, domdocument

Retrieve content from DOM->getElementById without element id


When using

$body = $dom->getElementById('content');

The output is the following:

<div id=content> 
  <div>
    <p>some text</p>
  </div>
</div>

I need to remove the <div id=content></div> part, since I only need its inner content, without the div whose id is content.

Needed result:

<div>
   <p>some text</p>
</div>

My current code:

$url = 'myfile.html';
$file = file_get_contents($url);
$dom = new DOMDocument();
$dom->loadHTML($file);
//$body = $dom->getElementsByTagName('body')->item(0);
$body = $dom->getElementById('content');
// saveHTML($body) serializes the element itself, so the outer div is included
$stringbody = $dom->saveHTML($body);
echo $stringbody;

Solution

  • getElementById returns a DOMElement (or null if no element has that id). Its childNodes property is a DOMNodeList, so you can iterate over the children and serialize each one with saveHTML to build the inner HTML.

    $str = "<div id='test'><p>inside</p></div>";
    
    $dom = new DOMDocument();
    $dom->loadHTML($str);    
    $body = $dom->getElementById('test');
    
    $innerHTML = '';
    
    foreach ($body->childNodes as $child) 
    { 
        $innerHTML .= $body->ownerDocument->saveHTML($child);
    }
    
    echo $innerHTML; // <p>inside</p>
    

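The same loop can be wrapped in a small reusable function and applied to the markup from the question. This is only a sketch: the name innerHtml is a hypothetical helper, not part of PHP's DOM extension, and the HTML string stands in for the contents of myfile.html.

```php
<?php
// Hypothetical helper (not a built-in): returns the serialized children
// of an element, i.e. its "inner HTML", without the element's own tags.
function innerHtml(DOMElement $element): string
{
    $html = '';
    foreach ($element->childNodes as $child) {
        // Passing a node to saveHTML() serializes only that node.
        $html .= $element->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Stand-in for the contents of myfile.html (assumed markup).
$dom = new DOMDocument();
$dom->loadHTML('<div id="content"><div><p>some text</p></div></div>');

$content = $dom->getElementById('content');

// getElementById() returns null when no element has the given id,
// so guard before reading the children.
$inner = $content === null ? '' : innerHtml($content);

echo $inner;
```

The null check matters when the id may be missing from the file; without it, the foreach would fail on a null value instead of producing an empty string.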