Less than and greater than sign in DOMDocument XML .lsg


I'm trying to add some new elements to an .lsg file (which consists of XML) with a PHP script. The .lsg file is then imported into LimeSurvey. The problem is that I can't properly add characters such as < and >, which I need. They appear only as their entity references (e.g. &lt; and &gt;), which doesn't work properly when importing into LimeSurvey. If I manually change the entity references to < and >, the import works.

I've tried to use PHP DOMDocument to do this. My code looks similar to this:

$dom = new DOMDocument();
$dom->load('template.lsg');

$subquestions = $dom->getElementById('subquestions');

$newRow = $dom->createElement('row');
$subquestions->appendChild($newRow);

$properties[] = array('name' => 'qid', 'value' => "![CDATA[1]]");

foreach ($properties as $prop) {
    $element = $dom->createElement($prop['name']);
    $text = $dom->createTextNode($prop['value']);
    $startTag = $dom->createEntityReference('lt');
    $endTag = $dom->createEntityReference('gt');
    $element->appendChild($startTag);
    $element->appendChild($text);
    $element->appendChild($endTag);
    $newRow->appendChild($element);
}

$response = $dom->saveXML();
$dom->save('test.lsg');

The result of that one row is like this:

<row>
    <qid>&lt;![CDATA[7]]&gt;</qid>
</row>

While it should look like this:

<row>
    <qid><![CDATA[7]]></qid>
</row>

Any suggestions?


Solution

  • CDATA sections are a special kind of text node. Their content needs far less escaping (characters such as < and & can be written literally), and they preserve leading and trailing whitespace. A DOM parser should therefore read the same value from both of the following example nodes:

    <examples>
      <example>text</example>
      <example><![CDATA[text]]></example>
    </examples>
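
    As a quick check of that claim, the following minimal sketch parses a small sample document and compares what the DOM returns for each node. The sample XML string and the variable names are assumptions made up for illustration:

    ```php
    <?php
    // Hypothetical sample document: one plain text node, one CDATA section
    $xml = '<examples>'
         . '<example>text</example>'
         . '<example><![CDATA[text]]></example>'
         . '</examples>';

    $document = new DOMDocument();
    $document->loadXML($xml);

    $values = [];
    foreach ($document->getElementsByTagName('example') as $example) {
        // textContent returns the character data for both node types
        $values[] = $example->textContent;
    }

    // Compare the value read from the text node with the one from the CDATA section
    var_dump($values[0] === $values[1]);
    ```

    Both elements should yield the string "text", so the comparison is expected to be true.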
    

    To create a CDATA section use the DOMDocument::createCDATASection() method and append it like any other node. DOMNode::appendChild() returns the appended node, so you can nest the calls:

    $properties = [
       [ 'name' => 'qid', 'value' => "1"]
    ];
    
    $document = new DOMDocument();
    $subquestions = $document->appendChild(
        $document->createElement('subquestions')
    );
    
    // appendChild() returns the node, so it can be nested
    $row = $subquestions->appendChild(
      $document->createElement('row')
    );
    // append the properties as elements with CDATA sections
    foreach ($properties as $property) {
        $element = $row->appendChild(
            $document->createElement($property['name'])
        );
        $element->appendChild(
            $document->createCDATASection($property['value'])
        );
    }
    
    $document->formatOutput = TRUE;
    echo $document->saveXML();
    

    Output:

    <?xml version="1.0"?>
    <subquestions>
      <row>
        <qid><![CDATA[1]]></qid>
      </row>
    </subquestions>
    

    Using normal text nodes works better most of the time, because special characters in them are escaped automatically when the document is saved.

    foreach ($properties as $property) {
        $element = $row->appendChild(
            $document->createElement($property['name'])
        );
        $element->appendChild(
            $document->createTextNode($property['value'])
        );
    }
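
    To illustrate why a plain text node is usually sufficient, here is a small sketch that stores a value containing XML special characters, serializes it, and parses it back. The sample value and the variable names are assumptions for illustration only:

    ```php
    <?php
    // Hypothetical value containing characters that are special in XML
    $value = '1 < 2 & 3 > 2';

    $document = new DOMDocument();
    $element = $document->appendChild($document->createElement('qid'));
    $element->appendChild($document->createTextNode($value));

    // On serialization the special characters are written as entity references
    $xml = $document->saveXML($element);
    echo $xml, "\n";

    // Parsing the serialized XML decodes them again
    $parsed = new DOMDocument();
    $parsed->loadXML($xml);
    var_dump($parsed->documentElement->textContent === $value);
    ```

    The entity references only exist in the serialized file; any XML parser that reads it should hand back the original string.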
    

    This can be shortened by assigning to the DOMNode::$textContent property, which creates the text node for you.

    foreach ($properties as $property) {
        $row->appendChild(
            $document->createElement($property['name'])
        )->textContent = $property['value'];
    }
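
    Applied to the scenario in the question, where the rows are added to an existing template rather than a new document, the approach might look like the sketch below. The inline `$template` string is a hypothetical stand-in for the contents of template.lsg, and its structure is an assumption. Note that getElementById() only matches attributes the parser knows to be of type ID (for example, declared in a DTD), so this sketch looks the element up by tag name instead:

    ```php
    <?php
    // Hypothetical stand-in for the contents of template.lsg
    $template = '<document><subquestions></subquestions></document>';

    $properties = [
        ['name' => 'qid', 'value' => '1'],
    ];

    $document = new DOMDocument();
    // With a real file this would be: $document->load('template.lsg');
    $document->loadXML($template);

    // Look up the first <subquestions> element by tag name
    $subquestions = $document->getElementsByTagName('subquestions')->item(0);
    if ($subquestions === null) {
        throw new RuntimeException('No <subquestions> element found');
    }

    // Append a new row and fill it with elements holding CDATA sections
    $row = $subquestions->appendChild($document->createElement('row'));
    foreach ($properties as $property) {
        $element = $row->appendChild($document->createElement($property['name']));
        $element->appendChild($document->createCDATASection($property['value']));
    }

    // With a real file this would be: $document->save('test.lsg');
    $xml = $document->saveXML();
    echo $xml;
    ```

    The null check is there because a missing element would otherwise only surface as an error on the first appendChild() call.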