Why does DOMDocument getElementsByTagName return only half of the NodeList?


I generate HTML containing non-standard tags with DOMDocument, and the result is this:

/* Input HTML
  <div id="toobar_top">
    <widget id="flag_holder"></widget>
    <widget id="horizontal_menu"></widget>
  </div>
  <div id="header">
    <widget name="header"></widget>
  </div>
*/

What I want to do is "translate" each widget into something useful. They are simple placeholders with parameters.

The function, extracted from the class, is:

// Takes a DOMDocument containing <widget> elements and translates them into usable markup
private function widgeter($doc) {
    $this->_widgetList = $doc->getElementsByTagName($this->_widgetTransformTo);
    foreach ($this->_widgetList as $widget) {
        $data = array();
        if ($widget->hasAttributes()) {
            foreach ($widget->attributes as $attribute) {
                $data[][$attribute->name] = $attribute->value;
                // @TODO: Implement widget transformation
            }
        }
        // Next 2 lines are just for debugging
        $string = serialize($data);
        $newWidget = $doc->createElement('p', $string);
        $widget->parentNode->replaceChild($newWidget, $widget);
    }
    return $doc;
}

Then, when I call saveHTML() on $doc, I see:

/* Output HTML
  <div id="toobar_top">
    <p>[{"id":"flag_holder"}]</p>
    <widget id="horizontal_menu"></widget>
  </div>
  <div id="header">
    <p>[{"id":"header"}]</p>
  </div>
*/

Why wasn't "horizontal_menu" translated?

It doesn't matter where the widgets are: I tried a single div containing all the widgets and one div per widget.

I can't figure it out.


Solution

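  • It happens because you are replacing elements of the DOMNodeList while looping over it. The DOMNodeList returned by getElementsByTagName is not an array but a live view of the document, so foreach does not operate on a copy. It operates on the object itself, which changes every time a matching node is replaced.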

    Basically, what I think is happening is:

    • You replace the first instance of <widget> (Item 0).
    • The pointer advances to the next item (Item 1).
    • Item 0 has been replaced and does not exist anymore.
    • Item shifting occurs: Item 1 becomes Item 0, Item 2 becomes Item 1.
    • The pointer still points to Item 1 (which was originally Item 2, effectively skipping a node).
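The shifting described in the steps above can be observed directly. The following is a minimal sketch, not the asker's code: it assumes a small hand-written document loaded with loadXML (to avoid HTML parser warnings about the unknown widget tag) and simply inspects the list before and after a single replacement.

```php
<?php
// Assumed sample document with three <widget> placeholders.
$doc = new DOMDocument();
$doc->loadXML('<div><widget id="a"/><widget id="b"/><widget id="c"/></div>');

// getElementsByTagName returns a live list tied to the document.
$widgets = $doc->getElementsByTagName('widget');
$lengthBefore = $widgets->length;

// Replace the first widget, as the loop body in the question does.
$first = $widgets->item(0);
$first->parentNode->replaceChild($doc->createElement('p', 'replaced'), $first);

// The same list object now reflects the modified document.
$lengthAfter = $widgets->length;
$newFirstId = $widgets->item(0)->getAttribute('id');

echo "before: $lengthBefore, after: $lengthAfter, item 0 is now: $newFirstId\n";
```

After the replacement the list is one element shorter and what used to be the second widget sits at index 0, so an iteration that moves on to index 1 never visits it.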

    What you need to do is to save the elements in an array and then change them, instead of looping on the DOMNodeList:

    // $domNodeList is the live list, e.g. the result of $doc->getElementsByTagName(...)
    $this->_widgetList = array();
    foreach ($domNodeList as $node) {
        $this->_widgetList[] = $node;
    }

    foreach ($this->_widgetList as $widget) {
        // do stuff
    }
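
For a complete picture, here is one way the snapshot approach might look as a standalone function. This is a sketch under assumptions: replaceWidgets is a hypothetical helper rather than the asker's class method, iterator_to_array is used as a shorthand for the copying loop above, json_encode is assumed for the debug string because the output shown in the question looks like JSON, and the sample document is loaded with loadXML for simplicity.

```php
<?php
// Hypothetical standalone version of the question's widgeter() method.
function replaceWidgets(DOMDocument $doc, string $tagName = 'widget'): DOMDocument
{
    // Snapshot the live list into a plain array before changing the document.
    $widgets = iterator_to_array($doc->getElementsByTagName($tagName));

    foreach ($widgets as $widget) {
        $data = array();
        foreach ($widget->attributes as $attribute) {
            $data[][$attribute->name] = $attribute->value;
        }

        // Debug placeholder: a <p> whose text is the encoded attribute list.
        $placeholder = $doc->createElement('p');
        $placeholder->appendChild($doc->createTextNode(json_encode($data)));
        $widget->parentNode->replaceChild($placeholder, $widget);
    }

    return $doc;
}

// Assumed sample input mirroring the structure in the question.
$doc = new DOMDocument();
$doc->loadXML(
    '<root>'
    . '<div id="toobar_top"><widget id="flag_holder"/><widget id="horizontal_menu"/></div>'
    . '<div id="header"><widget name="header"/></div>'
    . '</root>'
);

replaceWidgets($doc);

$remaining = $doc->getElementsByTagName('widget')->length;
$paragraphs = $doc->getElementsByTagName('p')->length;
echo $doc->saveXML();
```

Because the array holds its own references to the nodes, replacing one of them does not change which nodes the loop will visit next, so every widget is processed.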