How can I get text only from the current node with DOMElement?


<div>
     <a>abc</a>
     xyz
</div>

Given the HTML structure above, $divElement->nodeValue returns 'abc xyz', but I want only 'xyz'. $divElement->getAttribute('value') returns an empty string.

How can I get 'xyz' without removing the <a> element?


Solution

  • Iterate over the child nodes of the <div> and concatenate the values of its text nodes:

    http://3v4l.org/fnTAF

    $dom=new DOMDocument;
    $dom->loadHTML(<<<HTML
    <div>
         <a>abc</a>
         xyz
    </div>
    HTML
    );
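    $div = $dom->getElementsByTagName("div")->item(0);
    var_dump($div->childNodes->length); // debug only: number of direct child nodes
    $txt = "";
    // Iterate the DOMNodeList directly; this is also safe when there are no children.
    foreach ($div->childNodes as $child)
    {
        // Keep only direct text nodes (XML_TEXT_NODE === 3); the <a> element is skipped.
        if ($child->nodeType === XML_TEXT_NODE)
        {
            $txt .= $child->nodeValue;
        }
    }
    var_dump($txt); // still contains the whitespace around 'xyz'; trim() it if needed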
    

    A nodeType of 3 (the XML_TEXT_NODE constant) identifies a text node; its nodeName is #text.
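
As an alternative to walking childNodes by hand, the same direct text nodes can probably be selected with an XPath query. The sketch below is not part of the original answer: the helper name ownText is made up for illustration, and it assumes the element belongs to a DOMDocument. It uses DOMXPath::query with the relative expression text(), which matches only text nodes that are direct children of the context node.

```php
<?php
// Hypothetical helper (name chosen for illustration): returns the text that
// belongs directly to $element, ignoring text inside child elements.
function ownText(DOMElement $element): string
{
    $xpath = new DOMXPath($element->ownerDocument);
    $text = '';
    // "text()" relative to the context node selects its direct text-node children only.
    foreach ($xpath->query('text()', $element) as $textNode) {
        $text .= $textNode->nodeValue;
    }
    // The text nodes carry the surrounding whitespace and newlines, so trim them.
    return trim($text);
}

$dom = new DOMDocument;
$dom->loadHTML("<div>\n     <a>abc</a>\n     xyz\n</div>");
$div = $dom->getElementsByTagName('div')->item(0);
echo ownText($div), "\n";
```

Like the loop above, this leaves the <a> element in place. Whether the XPath form is clearer than the loop is a matter of taste; it mainly pays off when the selection needs more conditions than "direct text node".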