I want to split an XML file into smaller chunks for further processing. To be self-sufficient for import, each chunk must contain all the relevant information from outside its own scope.
Let's say I configure <ITEM> as the new, all-containing root:
<ROOT>
  <SUBROOT>
    <MOVEME>moveme</MOVEME>
    <ITEM>
      <AAA>111</AAA>
    </ITEM>
    <ITEM>
      <AAA>111</AAA>
    </ITEM>
  </SUBROOT>
</ROOT>
should become
<NEWROOT>
  <ITEM>
    <MOVEME>moveme</MOVEME>
    <AAA>111</AAA>
  </ITEM>
  <ITEM>
    <MOVEME>moveme</MOVEME>
    <AAA>111</AAA>
  </ITEM>
</NEWROOT>
What would be a high-performance solution? Thanks!
Consider XSLT (a sibling of XPath), the special-purpose language designed to transform XML files into other XML, HTML, or even plain-text files. PHP can run XSLT 1.0 scripts with the XSLTProcessor class from its php-xsl extension (be sure to enable it in the .ini file).
For small to medium-sized XML files, XSLT is a performant option, since you avoid foreach loops, if logic, and rebuilding the tree (e.g., with importNode) at the application layer. Specifically, the XSLT below walks down to the ITEM level and copies the needed element from the parent SUBROOT into each ITEM using XPath's ancestor:: axis. Adjust the node names in the script to match your actual document.
XSLT (save as an .xsl file, which is itself a well-formed XML file)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" />

  <xsl:template match="/ROOT">
    <NEWROOT>
      <xsl:apply-templates select="SUBROOT"/>
    </NEWROOT>
  </xsl:template>

  <xsl:template match="SUBROOT">
    <xsl:apply-templates select="ITEM"/>
  </xsl:template>

  <xsl:template match="ITEM">
    <xsl:copy>
      <xsl:copy-of select="ancestor::SUBROOT/MOVEME" />
      <xsl:copy-of select="*"/>
    </xsl:copy>
  </xsl:template>
</xsl:transform>
PHP
// LOAD XML AND XSL FILES
$xml = new DOMDocument;
$xml->load('/path/to/input.xml');
$xsl = new DOMDocument;
$xsl->load('/path/to/xslt_script.xsl');
// CONFIGURE TRANSFORMER
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// RUN TRANSFORMATION
// transformToXML() returns the result as a string, so no DOMDocument is needed here
$newXML = $proc->transformToXML($xml);
// OUTPUT NEW XML
echo $newXML;
// SAVE NEW DOM TREE TO FILE
file_put_contents('/path/to/output.xml', $newXML);
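To try the approach without touching the file system, the same steps can be wrapped in a small function that works on strings and checks each call for failure. This is only a minimal sketch: the helper name transformXml() and the variables $inputXml and $stylesheet are made up for illustration, and the sample data simply repeats the XML and XSLT shown above.

```php
<?php
// Hypothetical helper (not part of PHP): applies an XSLT 1.0 stylesheet
// to an XML string and returns the transformed XML as a string.
function transformXml(string $xmlSource, string $xslSource): string
{
    // The XSLTProcessor class is only available when the xsl extension is enabled.
    if (!extension_loaded('xsl')) {
        throw new RuntimeException('The xsl extension is not enabled.');
    }

    $xml = new DOMDocument();
    if (!$xml->loadXML($xmlSource)) {
        throw new RuntimeException('Could not parse the input XML.');
    }

    $xsl = new DOMDocument();
    if (!$xsl->loadXML($xslSource)) {
        throw new RuntimeException('Could not parse the stylesheet.');
    }

    $proc = new XSLTProcessor();
    if (!$proc->importStylesheet($xsl)) {
        throw new RuntimeException('Could not import the stylesheet.');
    }

    // transformToXml() returns a string on success, false or null otherwise.
    $result = $proc->transformToXml($xml);
    if (!is_string($result)) {
        throw new RuntimeException('Transformation failed.');
    }

    return $result;
}

// Sample input, copied from the question.
$inputXml = <<<'XML'
<ROOT>
  <SUBROOT>
    <MOVEME>moveme</MOVEME>
    <ITEM><AAA>111</AAA></ITEM>
    <ITEM><AAA>111</AAA></ITEM>
  </SUBROOT>
</ROOT>
XML;

// Stylesheet, copied from the XSLT above.
$stylesheet = <<<'XSL'
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="/ROOT">
    <NEWROOT>
      <xsl:apply-templates select="SUBROOT"/>
    </NEWROOT>
  </xsl:template>
  <xsl:template match="SUBROOT">
    <xsl:apply-templates select="ITEM"/>
  </xsl:template>
  <xsl:template match="ITEM">
    <xsl:copy>
      <xsl:copy-of select="ancestor::SUBROOT/MOVEME" />
      <xsl:copy-of select="*"/>
    </xsl:copy>
  </xsl:template>
</xsl:transform>
XSL;

echo transformXml($inputXml, $stylesheet);
```

Throwing exceptions here is just one way to surface failures; depending on your setup you may prefer to return false or to collect parser messages with libxml_use_internal_errors() instead.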