After a first XSLT transformation, I have an XML output similar to the following:
<?xml version="1.0" encoding="UTF-8"?>
<analysis type="1">
<file path="a.txt">
<line nb="23" found="true"/>
<line nb="36" found="true" count="2"/>
<line nb="98" found="true"/>
</file>
<file path="a.txt">
<line nb="100" found="false"/>
</file>
<file path="b.txt">
<line nb="10" found="false"/>
</file>
<!-- more file nodes below with different @path -->
</analysis>
But now I need to obtain a second output in which file nodes are merged if they have the same path attribute, as follows:
<?xml version="1.0" encoding="UTF-8"?>
<analysis type="1">
<file path="a.txt">
<line nb="23" found="true"/>
<line nb="36" found="true" count="2"/>
<line nb="98" found="true"/>
<line nb="100" found="false"/>
</file>
<file path="b.txt">
<line nb="10" found="false"/>
</file>
</analysis>
I don't know the possible @path values in advance.
I looked at multiple posts about merging nodes but could not find a way to do what I want. I'm lost with node grouping, keys, ID generation... and have only obtained error messages so far.
Could you please help me get the second output starting from the first one (with XSLT 1.0)? And if you could point me to some references (websites) that explain this kind of transformation, that would be really great.
Note: the @nb attributes of line nodes in two file nodes that have the same @path never collide; each @nb is unique for a given path, i.e. this will never happen:
<?xml version="1.0" encoding="UTF-8"?>
<analysis type="1">
<file path="a.txt">
<line nb="36" found="true" count="2"/>
</file>
<file path="a.txt">
<line nb="36" found="true"/>
</file>
</analysis>
Thank you very much for your help!
Since you state in your question that you have trouble understanding keys, here is one way of doing it without keys, using a technique called sibling recursion. It is generally considered inferior to using keys because it relies on the sibling axes (preceding-sibling and following-sibling), which are typically quite slow to traverse. However, in most practical situations, you will not notice the difference:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- identity template: copies everything not handled below -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- visit only the first file element for each distinct @path -->
  <xsl:template match="analysis">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates select="file[not(preceding-sibling::file/@path = @path)]" mode="sibling-recurse" />
    </xsl:copy>
  </xsl:template>

  <!-- first file of a group: copy it, then pull in later siblings with the same @path -->
  <xsl:template match="file" mode="sibling-recurse">
    <xsl:copy>
      <!-- back to default mode -->
      <xsl:apply-templates select="node() | @*" />
      <xsl:apply-templates select="following-sibling::file[current()/@path = @path]" />
    </xsl:copy>
  </xsl:template>

  <!-- later file of a group (default mode): emit only its children, not the file element -->
  <xsl:template match="file">
    <xsl:apply-templates select="node()" />
  </xsl:template>

</xsl:stylesheet>
The second approach uses Muenchian grouping, which is explained elsewhere (just follow a tutorial on it with this code in hand). It also uses the sibling axis, but in a far less expensive way (i.e., it does not have to traverse the whole sibling axis for every single node test).
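<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- index every file element by its @path value -->
  <xsl:key match="file" use="@path" name="path" />

  <!-- identity template: copies everything not handled below -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- visit only the file that is the first node returned by the key for its @path
       (generate-id() on a node-set uses its first node in document order) -->
  <xsl:template match="analysis">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates select="file[generate-id(.) = generate-id(key('path', @path))]" mode="sibling-recurse" />
    </xsl:copy>
  </xsl:template>

  <!-- first file of a group: copy it, then copy the children of later siblings with the same @path -->
  <xsl:template match="file" mode="sibling-recurse">
    <xsl:copy>
      <!-- back to default mode -->
      <xsl:apply-templates select="node() | @*" />
      <xsl:apply-templates select="following-sibling::file[@path = current()/@path]/node()" />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>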
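As a variation on the keyed stylesheet above, here is a minimal sketch that fetches the whole group through the key itself instead of walking the following-sibling axis. It reuses the key named path; the mode name group is my own choice. Note one assumption: key() looks up nodes across the whole document, so this only matches the stylesheet above if all file elements are children of a single analysis element, as in your sample.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- same key as above: file elements indexed by @path -->
  <xsl:key match="file" use="@path" name="path" />

  <!-- identity template -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- visit only the first file of each group -->
  <xsl:template match="analysis">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates select="file[generate-id(.) = generate-id(key('path', @path)[1])]" mode="group" />
    </xsl:copy>
  </xsl:template>

  <!-- "group" is a hypothetical mode name used only in this sketch -->
  <xsl:template match="file" mode="group">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <!-- key() returns every file with this @path, including the current one,
           so this single step processes the children of the whole group -->
      <xsl:apply-templates select="key('path', @path)/node()" />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the children are selected with one path expression, they are processed in document order, which is the same order in which the sibling-axis version emits them.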
Note: for both approaches, the mode switching is not strictly necessary, but it makes it easier to write simple match patterns and helps prevent priority conflicts and hard-to-find bugs (IMO).
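To illustrate what dropping the modes would look like, here is a rough sketch of the keyed approach without any mode. It is only one possible way to write it, and it relies on the default priority rules of XSLT 1.0: a pattern with a predicate gets priority 0.5, a plain element name gets 0, so the first template wins for the first file of each group.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:key match="file" use="@path" name="path" />

  <!-- identity template; it also handles the analysis element here -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- first file of each group (default priority 0.5 because of the predicate) -->
  <xsl:template match="file[generate-id(.) = generate-id(key('path', @path)[1])]">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
      <xsl:apply-templates select="following-sibling::file[@path = current()/@path]/node()" />
    </xsl:copy>
  </xsl:template>

  <!-- every other file (default priority 0): output nothing,
       its children were already copied into the first file of the group -->
  <xsl:template match="file" />

</xsl:stylesheet>
```

The match pattern is noticeably harder to read than a plain match="file" with a mode, and the result depends on the implicit priorities, which is the kind of subtlety the modes avoid.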