I am trying to create a nested hierarchy from flat XML based on level elements that represent a path. Each level element and the siblings that belong to it (names and number vary) should be wrapped in a 'record' element, thus creating a tree structure.
From this source (simplified):
<?xml version="1.0" encoding="UTF-8"?>
<record>
<level>first</level>
<unitid>0001</unitid>
<a-few-more-siblings/>
<level>first/second</level>
<unitid>0002</unitid>
<many-more-siblings/>
<level>first/second/third</level>
<unitid>0003a</unitid>
<some-more-siblings/>
<level>first/second/third</level>
<unitid>0003b</unitid>
<many-more-siblings/>
<level>first/second/third</level>
<unitid>0003c</unitid>
<some-more-siblings/>
<level>first</level>
<unitid>0004</unitid>
<again-more-siblings/>
</record>
I would like to generate the following desired output:
<Record level="first">
<level>first</level>
<unitid>0001</unitid>
<a-few-more-siblings/>
<Record level="second">
<level>second</level>
<unitid>0002</unitid>
<many-more-siblings/>
<Record level="third">
<level>third</level>
<unitid>0003a</unitid>
<some-more-siblings/>
</Record>
<Record level="third">
<level>third</level>
<unitid>0003b</unitid>
<many-more-siblings/>
</Record>
<Record level="third">
<level>third</level>
<unitid>0003c</unitid>
<some-more-siblings/>
</Record>
</Record>
</Record>
<Record level="first">
<level>first</level>
<unitid>0004</unitid>
<again-more-siblings/>
</Record>
The closest I could produce so far is:
<record level="first">
<level>first</level>
<unitid>0001</unitid>
<some-other-siblings/>
<record level="second">
<level>first/second</level>
<unitid>0002</unitid>
<some-other-siblings/>
<record level="third">
<level>first/second</level>
<unitid>0002</unitid>
<some-other-siblings/>
<level>first/second/third</level>
<unitid>0003a</unitid>
<some-other-siblings/>
</record>
<record level="third">
<level>first/second</level>
<unitid>0002</unitid>
<some-other-siblings/>
<level>first/second/third</level>
<unitid>0003a</unitid>
<some-other-siblings/>
<level>first/second/third</level>
<unitid>0003b</unitid>
<some-other-siblings/>
</record>
<record level="third">
<level>first/second/third</level>
<unitid>0003c</unitid>
<some-other-siblings/>
</record>
</record>
</record>
(The undesired siblings on the third level are additionally indented; 0004 on the first level fails to appear.)
I tried different variations of approaches suggested for similar problems ("flat to hierarchical", "following siblings until", etc.), but I end up either with too many siblings printed at a certain position or with only the first record output on the third level.
Any help is greatly appreciated.
One way to do this is to make use of keys. To start, to get the siblings of a level element, you could define a key that groups elements by the nearest preceding level element (i.e. each group holds all the siblings that belong to that level element).
<xsl:key name="siblings"
match="*[not(self::level)]"
use="generate-id(preceding-sibling::level[1])" />
You could also define a key to get the immediate 'descendants' of a level element (i.e. each level element is grouped by the nearest preceding level element with a shorter name, one whose path followed by '/' is a prefix of its own).
<xsl:key name="nextlevel"
match="level"
use="generate-id(preceding-sibling::level[starts-with(current(), concat(., '/'))][1])" />
In your XSLT you would then start off simply by selecting the 'first' level elements:
<xsl:apply-templates select="level[. = 'first']" />
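Comparing against the literal string 'first' ties the selection to this sample. A minimal sketch of a more general test, assuming that a top-level path is always a single segment with no '/' in it (true for the sample, not confirmed for the real data). The 'top-levels' wrapper is a hypothetical name used only to make the stylesheet runnable on its own:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Assumption: a top-level path is one segment, so it contains no '/'. -->
  <xsl:template match="record">
    <!-- 'top-levels' is a hypothetical wrapper, only here for demonstration. -->
    <top-levels>
      <!-- Copies every level element whose text has no '/' in it. -->
      <xsl:copy-of select="level[not(contains(., '/'))]"/>
    </top-levels>
  </xsl:template>
</xsl:stylesheet>
```

The same predicate, level[not(contains(., '/'))], could be used in the select of xsl:apply-templates in place of level[. = 'first'].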
You would then have a generic template matching level elements, in which you could use both keys to output the siblings and the next-level elements:
<xsl:template match="level">
<Record level="{.}">
<xsl:copy-of select="." />
<xsl:apply-templates select="key('siblings', generate-id())" />
<xsl:apply-templates select="key('nextlevel', generate-id())" />
</Record>
</xsl:template>
Try the following XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:key name="siblings" match="*[not(self::level)]" use="generate-id(preceding-sibling::level[1])" />
<xsl:key name="nextlevel" match="level" use="generate-id(preceding-sibling::level[starts-with(current(), concat(., '/'))][1])" />
<xsl:template match="record">
<xsl:apply-templates select="level[. = 'first']" />
</xsl:template>
<xsl:template match="level">
<Record level="{.}">
<xsl:copy-of select="." />
<xsl:apply-templates select="key('siblings', generate-id())" />
<xsl:apply-templates select="key('nextlevel', generate-id())" />
</Record>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to your XML, this produces the following output:
<Record level="first">
<level>first</level>
<unitid>0001</unitid>
<a-few-more-siblings/>
<Record level="first/second">
<level>first/second</level>
<unitid>0002</unitid>
<many-more-siblings/>
<Record level="first/second/third">
<level>first/second/third</level>
<unitid>0003a</unitid>
<some-more-siblings/>
</Record>
<Record level="first/second/third">
<level>first/second/third</level>
<unitid>0003b</unitid>
<many-more-siblings/>
</Record>
<Record level="first/second/third">
<level>first/second/third</level>
<unitid>0003c</unitid>
<some-more-siblings/>
</Record>
</Record>
</Record>
<Record level="first">
<level>first</level>
<unitid>0004</unitid>
<again-more-siblings/>
</Record>
This isn't quite what you are currently showing as your expected output, because your expected output has two 'first' level elements wrapped in a single Record element (compared with separate Record elements for the 'third' level elements). If that really is the output you want, try replacing the template that matches record with these two templates instead:
<xsl:template match="record">
<Record level="first">
<xsl:apply-templates select="level[. = 'first']" />
</Record>
</xsl:template>
<xsl:template match="level[. = 'first']">
<xsl:copy-of select="." />
<xsl:apply-templates select="key('siblings', generate-id())" />
<xsl:apply-templates select="key('nextlevel', generate-id())" />
</xsl:template>
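The desired output in the question also shows only the last path segment in each level attribute and level element ('second' rather than 'first/second'). XSLT 1.0 has no built-in function for "the text after the last '/'", so one option is a recursive named template. The following is a minimal sketch under that reading of the question, reusing the two keys from above; the named template 'last-segment' and its 'path' parameter are hypothetical names introduced here, and the siblings are copied with xsl:copy-of instead of going through an identity template:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Same keys as in the answer above. -->
  <xsl:key name="siblings" match="*[not(self::level)]"
           use="generate-id(preceding-sibling::level[1])"/>
  <xsl:key name="nextlevel" match="level"
           use="generate-id(preceding-sibling::level[starts-with(current(), concat(., '/'))][1])"/>

  <xsl:template match="record">
    <xsl:apply-templates select="level[. = 'first']"/>
  </xsl:template>

  <xsl:template match="level">
    <!-- Work out the last path segment once and reuse it below. -->
    <xsl:variable name="name">
      <xsl:call-template name="last-segment">
        <xsl:with-param name="path" select="string(.)"/>
      </xsl:call-template>
    </xsl:variable>
    <Record level="{$name}">
      <level><xsl:value-of select="$name"/></level>
      <!-- Siblings belonging to this level, copied unchanged. -->
      <xsl:copy-of select="key('siblings', generate-id())"/>
      <!-- Child levels are processed by this same template. -->
      <xsl:apply-templates select="key('nextlevel', generate-id())"/>
    </Record>
  </xsl:template>

  <!-- Hypothetical helper: strips leading segments until no '/' is left. -->
  <xsl:template name="last-segment">
    <xsl:param name="path"/>
    <xsl:choose>
      <xsl:when test="contains($path, '/')">
        <xsl:call-template name="last-segment">
          <xsl:with-param name="path" select="substring-after($path, '/')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$path"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The keys still work on the full paths in the source document, so the nesting logic is unchanged; only the values written to the output are shortened.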