
PHP: How to force SimpleXML to use a certain data type for a node


How can one force simplexml_load_string() to use the same data structure at each node?

$xml = "
<level1>
  <level2>
    <level3>Hello</level3>
    <level3>stackoverflow</level3>
  </level2>
  <level2>
    <level3>My problem</level3>
  </level2>
</level1>";

$xmlObj = simplexml_load_string($xml);
var_dump($xmlObj);

Examining the output:

level1 is an object; level2 is an array; level2[0] is an array.

level2[1] is an object because it has only one child node, which I'd rather have as a single-element array.

I'm collecting the XML from users, and there may be one or more nodes inside each level2. My sanitisation block is a foreach loop, which fails when there's only one node inside level2.

The sanitisation block looks something like this:

foreach ($xmlObj->level2 as $lvl2) {
  if ($lvl2->level3[0] == 'condition') { doSomething(); }
}

doSomething() works fine when <level2> always has more than one child node in the XML string. If <level2> has only one child <level3> node, an error about trying to get an attribute of a non-object comes up.

var_dump shows that the data type changes from object to array depending on how many nodes are nested within.

I'd prefer a way to ensure that <level2> is always an array, regardless of how many children it contains. That would save me from editing too much, but any other way out would suffice.

Thanks


Solution

  • That information is not available in the XML itself, so you will have to add it in your implementation. SimpleXML provides both list and item access to child elements. If you access a child as a list (for example with foreach), it provides all matching child elements.

    $xml = "
    <level1>
      <level2>
        <level3>Hello</level3>
        <level3>stackoverflow</level3>
      </level2>
      <level2>
        <level3>My problem</level3>
      </level2>
    </level1>";
    
    $level1 = new SimpleXMLElement($xml);
    
    $result = [];
    foreach($level1->level2 as $level2) {
        $data2 = [];
        foreach ($level2->level3 as $level3) {
            $data2[] = (string)$level3;
        }
        $result[] = $data2;
    }
    
    var_dump($result);
    

    So the trick is to use the SimpleXMLElement instance directly and not convert it into an array. Do not treat the creation of your JSON structure as a generic conversion. Build up a specific output while reading the XML using SimpleXML.
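
As a rough sketch of how this could apply to the check from the question: the snippet below counts the <level3> children, reads the first one by index, and loops over them, all on the SimpleXMLElement itself. The helper name hasMatchingLevel3() and the search value 'My problem' are assumptions made for illustration; they are not part of SimpleXML or of the original code.

```php
$xml = <<<XML
<level1>
  <level2>
    <level3>Hello</level3>
    <level3>stackoverflow</level3>
  </level2>
  <level2>
    <level3>My problem</level3>
  </level2>
</level1>
XML;

$level1 = new SimpleXMLElement($xml);

// List access: count() sees every matching child,
// whether there is one <level3> or several.
$counts = [];
foreach ($level1->level2 as $level2) {
    $counts[] = count($level2->level3);
}

// Item access: index 0 is valid even when only one <level3> exists.
$firstOfSecond = (string) $level1->level2[1]->level3[0];

// Hypothetical helper (not part of SimpleXML):
// true if any <level3> child of $level2 has exactly the text $needle.
function hasMatchingLevel3(SimpleXMLElement $level2, string $needle): bool
{
    foreach ($level2->level3 as $level3) {
        if ((string) $level3 === $needle) {
            return true;
        }
    }
    return false;
}

// Run the check for each <level2>, regardless of how many children it has.
$matches = [];
foreach ($level1->level2 as $level2) {
    $matches[] = hasMatchingLevel3($level2, 'My problem');
}
```

Casting each element to string before comparing keeps the comparison explicit, and because the loop never depends on var_dump's array-or-object view, the same code path handles one child or many.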