PHP, DOMXPath: invalid item(x) return value


A simple thing, but... We have the following PHP code:

$oPath = new \DOMXPath($this->oHtmlProperty);
$oNode = $oPath->query('//div[@class="product-spec__body"]');

foreach ($oNode as $oNodeProperty) {
    $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);

    // ### VARIANT 1 (error with message 'Trying to get property of non-object')

    // $aPropertyGroup = [
    //     'title' => $oListTitle->item(0)->textContent,
    //     'property' => []
    // ];

    // ### VARIANT 2

    foreach ($oListTitle as $oListTitleItem){
        $aPropertyGroup = [
            'title' => $oListTitleItem->textContent,
            'property' => []
        ];

        break; // we need only the first item
    }

// ....

The main point is that $oListTitle always contains exactly one node, reachable as ->item(0). When we try to read it that way, we get an error with the message 'Trying to get property of non-object', even though the node exists! When we do the same thing through iteration (which yields the same node class as calling ->item(x)), we get what we need.

Can someone tell me why? XD

ADDED:

$oListTitle is :

object(DOMNodeList)#340 (1) { ["length"]=> int(1) } 

ADDED:

var_dump($oListTitle->item(0)); returns this:

object(DOMElement)#338 (18) { ["tagName"]=> string(2) "h2" ["schemaTypeInfo"]=> NULL ["nodeName"]=> string(2) "h2" ["nodeValue"]=> string(45) "Основные характеристики" ["nodeType"]=> int(1) ["parentNode"]=> string(22) "(object value omitted)" ["childNodes"]=> string(22) "(object value omitted)" ["firstChild"]=> string(22) "(object value omitted)" ["lastChild"]=> string(22) "(object value omitted)" ["previousSibling"]=> NULL ["nextSibling"]=> string(22) "(object value omitted)" ["attributes"]=> string(22) "(object value omitted)" ["ownerDocument"]=> string(22) "(object value omitted)" ["namespaceURI"]=> NULL ["prefix"]=> string(0) "" ["localName"]=> string(2) "h2" ["baseURI"]=> NULL ["textContent"]=> string(45) "Основные характеристики" }

In other words, the node is not empty and it exists.
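
The difference between the two variants can be illustrated with a small sketch. It rests on an assumption the question does not confirm: that at least one div.product-spec__body in the real page has no matching h2. In that case item(0) returns null for that block and reading ->textContent raises the notice, while a foreach over the same empty list simply does nothing. The markup and the $titles variable below are hypothetical.

```php
<?php
// Hypothetical markup: the second block has no matching <h2>.
$html = '<html><body>
<div class="product-spec__body"><h2 class="title title_size_22">h2_1</h2></div>
<div class="product-spec__body"><p>no title here</p></div>
</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$titles = [];
foreach ($xpath->query('//div[@class="product-spec__body"]') as $block) {
    // Relative query, evaluated with $block as the context node.
    $list = $xpath->query('h2[@class="title title_size_22"]', $block);

    // item(0) returns null when the list is empty, so check before reading
    // textContent; a foreach over the same empty list would just be skipped.
    $first = $list->item(0);
    $titles[] = ($first !== null) ? $first->textContent : null;
}

var_export($titles);
```

If this is what happens, the var_dump shown above would come from a block that does have a title, and the notice from a later block that does not.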


Solution

  • I cannot reproduce the problem using PHP 5.6.3/win32 and the following code (your code plus some boilerplate):

    <?php
    $foo = new Foo;
    var_export($foo->bar());
    
    class Foo {
    
        public function __construct() {
            $this->oHtmlProperty = new DOMDocument;
            $this->oHtmlProperty->loadHTML('<html><head><title>...</title></head><body>
        <div class="product-spec__body">
            <h2 class="title title_size_22">h2_1</h2>
            <h2 class="title title_size_22">h2_2</h2>
        </div>
        <div></div>
        <div class="product-spec__body">
            <h2 class="title title_size_22">h2_3</h2>
            <h2 class="title title_size_22">h2_4</h2>
        </div>
    </body></html>');
        }
    
        public function bar() {
            $retval = array(); $aPropertyGroup = array();
            $oPath = new \DOMXPath($this->oHtmlProperty);
            $oNode = $oPath->query('//div[@class="product-spec__body"]');
    
            foreach ($oNode as $oNodeProperty) {
                $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);
                // ### VARIANT 1 (error with message 'Trying to get property of non-object')
                if ( !is_object($oListTitle) ) die('$oListTitle is not an object');
                if ( ! ($oListTitle instanceof DOMNodeList) ) die('$oListTitle is not a DOMNodeList');
                if ( $oListTitle->length < 1 ) die('oListTitle->length < 1');
                $node = $oListTitle->item(0);
                if ( is_null($node) ) die('$node is NULL');
                if ( !is_object($node) ) die('$node is not an object');
                if ( ! ($node instanceof DOMNode) ) die('$node is not a DOMNode');
    
                $aPropertyGroup = [
                    'title' => $oListTitle->item(0)->textContent,
                    'property' => []
                ];
    
                if ( !empty($aPropertyGroup) ) {
                    $retval[] = $aPropertyGroup;
                    $aPropertyGroup = array();
                }
            } 
    
            return $retval;
        }
    }
    

    The output is

    array (
      0 => 
      array (
        'title' => 'h2_1',
        'property' => 
        array (
        ),
      ),
      1 => 
      array (
        'title' => 'h2_3',
        'property' => 
        array (
        ),
      ),
    )
    

    as expected.
    But maybe libxml_get_last_error() can tell you more about what happens with your actual document.
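
A minimal sketch of how the libxml error functions could be used for this, assuming the document is loaded with loadHTML. The malformed input string is hypothetical, and whether a given input produces any errors at all depends on the libxml2 version in use.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Hypothetical, deliberately malformed input (mismatched closing tag).
$doc->loadHTML('<html><body><div class="product-spec__body"><h2>t</h3></div></body></html>');

// The most recent error, or false if the parser reported none.
$last = libxml_get_last_error();

$messages = [];
foreach (libxml_get_errors() as $error) {
    // Each entry is a LibXMLError with level, code, line and message.
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}
// Empty the internal buffer so later parses start clean.
libxml_clear_errors();

var_export($messages);
```

Running the same check right after loading the real page would show whether the parser repaired the markup in a way that changes where the h2 elements end up.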