I want to create an XML file that embeds encoded XHTML. The encoded XHTML is stored in a separate file. While building the XML, I would like to add the encoded content of the XHTML file to the element test. After I add it and echo the final output to the browser, the browser shows an error:
This page contains the following errors: error on line 9 at column 144: Encoding error. Below is a rendering of the page up to the first error.
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$content = (file_get_contents("test_xmlencoding.xhtml"));
$element = $dom->createElement('test', $content);
$dom->appendChild($element);
header('Content-type: text/xml;');
echo $dom->saveXML();
?>
The XHTML file (test_xmlencoding.xhtml):
<?xml version="1.0" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="TX21_HTM 21.0.406.501" name="GENERATOR" />
<title></title>
</head>
<body style="font-family:'Arial';font-size:12pt;text-align:left;">
<p lang="en-US" style="margin-top:0pt;margin-bottom:0pt;"><span style="font-family:'Verdana';font-size:9pt;">ABC1.</span></p>
<p lang="en-US" style="margin-top:0pt;margin-bottom:0pt;"><span style="font-family:'Verdana';font-size:9pt;">(ABC2)</span></p>
<p lang="en-US" style="margin-top:0pt;margin-bottom:0pt;"><span style="font-family:'Verdana';font-size:9pt;"> </span></p>
<p lang="en-US" style="margin-top:0pt;margin-bottom:0pt;"><span style="font-family:'Verdana';font-size:9pt;">ABC3</span></p>
</body>
</html>
When I add the XHTML content without encoding, the output renders in the browser without error.
I also tried replacing
$content = (file_get_contents("test_xmlencoding.xhtml"));
with
$content = htmlentities(file_get_contents("test_xmlencoding.xhtml"));
With that change, the output shows only the closing tag of the test element, </test>.
The second argument of DOMDocument::createElement() and the DOMNode::$nodeValue property only escape partially. They expect special characters to be already escaped as entities, with the exception of < and >, which are escaped for you.
$document = new DOMDocument();
$document->appendChild(
$tests = $document->createElement('tests')
);
$tests
->appendChild($document->createElement('test', 'a < b'));
$tests
->appendChild($document->createElement('test', 'a & b'));
echo $document->saveXML();
Output:
Warning: DOMDocument::createElement(): unterminated entity reference b in ... on line 9
<?xml version="1.0"?>
<tests><test>a &lt; b</test><test/></tests>
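To illustrate the partial escaping described above, here is a minimal sketch showing that the value argument accepts an ampersand once it is written as an entity. The variable names are illustrative and not part of the original example.

```php
<?php
$document = new DOMDocument();
$tests = $document->appendChild($document->createElement('tests'));

// The value argument is read as pre-escaped text, so the entity
// &amp; ends up as a single literal ampersand in the node.
$test = $tests->appendChild($document->createElement('test', 'a &amp; b'));

// Text as stored in the node
echo $test->textContent, "\n";
// The same node serialized back to XML
echo $document->saveXML($test), "\n";
```

This only works if every value is escaped by hand before it is passed in, which is easy to forget for content read from a file.');
</test>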
The method argument is not part of the DOM standard, and the property behaves differently from the specification.
In the original DOM you were expected to add the content as a separate text node. This allows for mixed child nodes, too. Modern DOM introduced the DOMNode::$textContent property, which acts as a shortcut.
Here is an example:
$xhtml = <<<'XHTML'
<?xml version="1.0" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<em>a &amp; b</em>
</body>
</html>
XHTML;
$document = new DOMDocument();
$document->appendChild(
    $tests = $document->createElement('tests')
);
// append child element and set its text content
$tests
    ->appendChild($document->createElement('test'))
    ->textContent = $xhtml;
// append child element, then append child text node
$tests
    ->appendChild($document->createElement('test'))
    ->appendChild($document->createTextNode($xhtml));
$document->formatOutput = true;
echo $document->saveXML();
Output (take note of the double-escaped &amp;amp;):
<?xml version="1.0"?>
<tests>
  <test>&lt;?xml version="1.0" ?&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
&lt;body&gt;
&lt;em&gt;a &amp;amp; b&lt;/em&gt;
&lt;/body&gt;
&lt;/html&gt;</test>
  <test>&lt;?xml version="1.0" ?&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
&lt;body&gt;
&lt;em&gt;a &amp;amp; b&lt;/em&gt;
&lt;/body&gt;
&lt;/html&gt;</test>
</tests>
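Applied to the code from the question, the text-node approach might look like the sketch below. The inline $content string is a stand-in for file_get_contents("test_xmlencoding.xhtml") so that the example is self-contained. The sketch also assumes the file is valid UTF-8, which PHP's DOM extension expects for string input.

```php
<?php
// Stand-in for file_get_contents("test_xmlencoding.xhtml") (assumption:
// the real file is valid UTF-8).
$content = <<<'XHTML'
<?xml version="1.0" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
<body><p>ABC1 &amp; ABC2</p></body>
</html>
XHTML;

$dom = new DOMDocument('1.0', 'utf-8');

// Create the element without a value, then add the XHTML as a text node
// so that <, > and & are all escaped on output.
$element = $dom->appendChild($dom->createElement('test'));
$element->appendChild($dom->createTextNode($content));

$xml = $dom->saveXML();

// When serving to a browser, send the header before any output:
// header('Content-Type: text/xml; charset=utf-8');
echo $xml;
```

A consumer that parses this XML and reads the text content of the test element gets the original XHTML string back unchanged.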