
PHP DOMDocument: what is the nicest way to safely add text to an element


When adding a string that might contain troublesome characters (e.g. &, <, >), DOMDocument::createElement() emits a warning rather than escaping the string.

I'm looking for a succinct way to make strings XML-safe, ideally something that leverages the DOMDocument library.

I'm looking for something better than preg_replace or htmlspecialchars. I see DOMDocument::createTextNode(), but the resulting DOMText object is cumbersome and can't be handed to DOMDocument::createElement().

To illustrate the problem, this code:

<?php 

$dom = new DOMDocument;
$dom->formatOutput = true;
$parent = $dom->createElement('rootNode');
$parent->appendChild( $dom->createElement('name', 'this ampersand causes pain & sorrow ') );
$dom->appendChild( $parent );
echo $dom->saveXml();

produces this result (see eval.in):

Warning: DOMDocument::createElement(): unterminated entity reference          sorrow in /tmp/execpad-41ee778d3376/source-41ee778d3376 on line 6
<?xml version="1.0"?>
<rootNode>
  <name>this ampersand causes pain </name>
</rootNode>

Solution

  • You will have to create the text node and append it. I described the problem in this answer: https://stackoverflow.com/a/22957785/2265374
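
As a minimal sketch of "create the text node and append it", the snippet below reuses the element names and string from the question. The variable names ($root, $name, $xml) are illustrative only; the DOM calls are the standard createElement(), createTextNode() and appendChild().

```php
<?php
// Sketch: add text through createTextNode() so it is escaped when serialized.
$dom = new DOMDocument();
$dom->formatOutput = true;

$root = $dom->appendChild($dom->createElement('rootNode'));

// Create the element without a value, then attach the text as a DOMText child.
$name = $root->appendChild($dom->createElement('name'));
$name->appendChild($dom->createTextNode('this ampersand causes pain & sorrow'));

$xml = $dom->saveXML();
echo $xml;
```

The serialized name element should then contain "pain &amp; sorrow" with no warning, because the text node is escaped on output instead of being parsed as markup.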

    However, you can extend DOMDocument and override createElement() and createElementNS().

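    class MyDOMDocument extends DOMDocument {
    
      // Same call shape as DOMDocument::createElement(), but $content is
      // appended as a text node, so special characters are escaped on output.
      public function createElement($name, $content = '') {
        $node = parent::createElement($name);
        // Skip the text node entirely when there is no content.
        if ((string)$content !== '') {
          $node->appendChild($this->createTextNode($content));
        }
        return $node;
      }
    
      // Namespace-aware variant with the same handling of $content.
      public function createElementNS($namespace, $name, $content = '') {
        $node = parent::createElementNS($namespace, $name);
        if ((string)$content !== '') {
          $node->appendChild($this->createTextNode($content));
        }
        return $node;
      }
    }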
    
    $dom = new MyDOMDocument();
    $root = $dom->appendChild($dom->createElement('foo'));
    $root->appendChild($dom->createElement('bar', 'Company & Son'));
    $root->appendChild($dom->createElementNS('urn:bar', 'bar', 'Company & Son'));
    
    $dom->formatOutput = TRUE;
    echo $dom->saveXml();
    

    Output:

    <?xml version="1.0"?>
    <foo>
      <bar>Company &amp; Son</bar>
      <bar xmlns="urn:bar">Company &amp; Son</bar>
    </foo>
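
If subclassing DOMDocument is not an option, the same idea can be sketched as a standalone function. appendTextElement() is a hypothetical helper name, not part of the DOM extension, and the scalar type declarations assume PHP 7.0 or later. The second half parses the output again to check that the original string survives a round trip.

```php
<?php
/**
 * Hypothetical helper (not part of the DOM extension): creates a child
 * element under $parent and adds $text as a text node, so special
 * characters are escaped when the document is serialized.
 */
function appendTextElement(DOMNode $parent, string $name, string $text): DOMElement
{
    // Use the parent's owner document, or the parent itself if it is the document.
    $doc = $parent instanceof DOMDocument ? $parent : $parent->ownerDocument;
    $element = $doc->createElement($name);
    if ($text !== '') {
        $element->appendChild($doc->createTextNode($text));
    }
    $parent->appendChild($element);
    return $element;
}

$dom = new DOMDocument();
$root = appendTextElement($dom, 'foo', '');
appendTextElement($root, 'bar', 'Company & Son <Ltd>');

$xml = $dom->saveXML();
echo $xml;

// Round trip: parse the serialized XML and read the text back.
$check = new DOMDocument();
$check->loadXML($xml);
$roundTrip = $check->getElementsByTagName('bar')->item(0)->textContent;
echo $roundTrip, "\n";
```

A free function keeps the stock DOMDocument class in use, which may matter when the document object comes from code you do not control; the subclass above is more convenient when you create the document yourself.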