Search code examples
phpxmljsonxpathsimplexml

SimpleXMLElement using xpath dropping CDATA


I need to turn a node of an XML, recursively, into a json string. I have for the most part

$sku = "AC2061414";
$dom = new SimpleXMLElement(file_get_contents( "/usr/share//all_products.xml" )); 
$query = '//sku[text() = "'.$sku.'"]';
$entries = $dom->xpath($query);

foreach ($entries as $entry) {

    $parent_div = $entry->xpath( 'parent::*' );
    $nodearray=array();

    foreach($parent_div as $node) {
        if ($node->nodeType == XML_CDATA_SECTION_NODE) {
            $nodearray[$node->getName()]=$node->textContent;
        }else{
            $nodearray[$node->getName()]=$node;
        }
    }
    $ajax = json_encode( $nodearray );
    print($ajax);
}

Run on

<?xml version="1.0" encoding="UTF-8"?>
<products>
   <product active="1" on_sale="0" discountable="1">
    <sku>AC2061414</sku>
    <name><![CDATA[ALOE CADABRA ORGANIC LUBE PINA COLADA 2.5OZ]]></name>
    <description><![CDATA[ text text ]]></description>
    <keywords/>
    <price>7.45</price>
    <stock_quantity>30</stock_quantity>
    <reorder_quantity>0</reorder_quantity>
    <height>5.25</height>
    <length>2.25</length>
    <diameter>0</diameter>
    <weight>0.27</weight>
    <color></color>
    <material>aloe vera, vitamin E</material>
    <barcode>826804006358</barcode>
    <release_date>2012-07-26</release_date>
    <images>
      <image>/AC2061414/AC2061414A.jpg</image>
    </images>
    <categories>
      <category code="528" video="0" parent="0">Lubricants</category>
      <category code="531" video="0" parent="528">Flavored</category>
      <category code="28" video="0" parent="25">Oral Products</category>
      <category code="532" video="0" parent="528">Natural</category>
    </categories>
    <manufacturer code="AC" video="0">Aloe Cadabra Lubes</manufacturer>
    <type code="LU" video="0">Lubes</type>
  </product>
</products>

And ends with

{"product":{"@attributes":{"active":"1","on_sale":"0","discountable":"1"},"sku":"AC2061414","name":{},"description":{},"keywords":{},"price":"7.45","stock_quantity":"30","reorder_quantity":"0","height":"5.25","length":"2.25","diameter":"0","weight":"0.27","color":{},"material":"aloe vera, vitamin E","barcode":"826804006358","release_date":"2012-07-26","images":{"image":"\/AC2061414\/AC2061414A.jpg"},"categories":{"category":["Lubricants","Flavored","Oral Products","Natural"]},"manufacturer":"Aloe Cadabra Lubes","type":"Lubes"}}

Which seem ok except for the missing node values that were CDATA. I did try to account for it but it is not working. What is the trick here?


Solution

  • You can try adding LIBXML_NOCDATA option to the constructor.

    $dom = new SimpleXMLElement(file_get_contents( "/usr/share//all_products.xml" ), LIBXML_NOCDATA);
    ...
    

    More details here.