
xslt 1.0, select group of nodes with key


I want to select nodes based on their attributes: within each prot element, only the first node for each num value should be kept. The input XML:

<data>
    <prot seq="AAA">
        <node num="1">1345</node>
        <node num="1">11245</node>
        <node num="2">88885</node>
    </prot>
    <prot seq="BBB">
        <node num="1">678</node>
        <node num="1">456</node>
        <node num="2">6666</node>
    </prot>
    <prot seq="CCC">
        <node num="1">111</node>
        <node num="1">222</node>
        <node num="2">333</node>
    </prot>
</data>

The XML that I want:

<data>
    <prot seq="AAA">
        <node num="1">1345</node>
        <node num="2">88885</node>
    </prot>
    <prot seq="BBB">
        <node num="1">678</node>
        <node num="2">6666</node>
    </prot>
    <prot seq="CCC">
        <node num="1">111</node>
        <node num="2">333</node>
    </prot>
</data>

So my idea was to group the nodes with an xsl:key element and then loop over each group with for-each. For example:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:key name="by" match="/data/prot" use="concat(@seq,'|',node/@num)"/>
    <xsl:template match="/">
        <root>
            <xsl:apply-templates select="/data/prot"/>
        </root>
    </xsl:template>
    <xsl:template match="/data/prot">
        <xsl:for-each select="./node">
            <xsl:for-each select="key('by',concat(current()/../@seq,'|',current()/@num))">
                node <xsl:value-of select="./node" />
            </xsl:for-each>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

But the output is not what I expected, and I cannot see what I am doing wrong. I would prefer to keep the for-each structure. It seems that I am not using the xsl:key grouping features properly.

The (unwanted) output that I get:

<root>
    node 1345
    node 1345
    node 678
    node 678
    node 111
node 111</root>

The code can be tested at http://www.xsltcake.com/slices/sgWUFu/20

Thanks!


Solution

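  • The main problem in your code is that the key indexes prot elements, but what we want to de-duplicate (and therefore need to index) is the node elements. In addition, in XSLT 1.0 concat() converts the node-set node/@num to the string value of its first node only, so each prot is indexed under a single value such as 'AAA|1'. That is why the lookup for num="2" finds nothing, and each lookup for num="1" returns the whole prot, whose first node value is printed.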

    Here is a short and correct solution:

    <xsl:stylesheet version="1.0" 
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>
    
     <xsl:key name="nodeByParentAndNum" match="node"
      use="concat(generate-id(..), '+', @num)"/>
    
     <xsl:template match="node()|@*">
      <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
     </xsl:template>
    
     <xsl:template match="/*">
      <data>
       <xsl:apply-templates/>
      </data>
     </xsl:template>
    
     <!-- Suppress every node that is not the first one in its (parent, @num) group -->
     <xsl:template match=
     "node[not(generate-id()
               = generate-id(key('nodeByParentAndNum',
                                 concat(generate-id(..), '+', @num))[1]))]"/>
    </xsl:stylesheet>
    

    When this transformation is applied to the provided XML document:

    <data>
        <prot seq="AAA">
            <node num="1">1345</node>
            <node num="1">11245</node>
            <node num="2">88885</node>
        </prot>
        <prot seq="BBB">
            <node num="1">678</node>
            <node num="1">456</node>
            <node num="2">6666</node>
        </prot>
        <prot seq="CCC">
            <node num="1">111</node>
            <node num="1">222</node>
            <node num="2">333</node>
        </prot>
    </data>
    

    the wanted, correct result is produced:

    <data>
       <prot seq="AAA">
          <node num="1">1345</node>
          <node num="2">88885</node>
       </prot>
       <prot seq="BBB">
          <node num="1">678</node>
          <node num="2">6666</node>
       </prot>
       <prot seq="CCC">
          <node num="1">111</node>
          <node num="2">333</node>
       </prot>
    </data>
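
    Since the question asks to keep the for-each structure, the same key can also be used inside nested for-each loops instead of an empty suppressing template. The following is only a sketch under that assumption, not part of the transformation above: it reuses the nodeByParentAndNum key and hard-codes the data and prot element names from the sample document.

    ```xml
    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <!-- Same key as above: index node elements by parent identity plus @num -->
     <xsl:key name="nodeByParentAndNum" match="node"
      use="concat(generate-id(..), '+', @num)"/>

     <xsl:template match="/data">
      <data>
       <xsl:for-each select="prot">
        <prot seq="{@seq}">
         <!-- Keep only the first node of each (parent, @num) group -->
         <xsl:for-each select="node[generate-id()
              = generate-id(key('nodeByParentAndNum',
                                concat(generate-id(..), '+', @num))[1])]">
          <xsl:copy-of select="."/>
         </xsl:for-each>
        </prot>
       </xsl:for-each>
      </data>
     </xsl:template>
    </xsl:stylesheet>
    ```

    The predicate is the positive form of the test used in the empty template above: a node is selected only when it is the first node the key returns for its own parent and num value. Because the key value includes generate-id(..), equal num values under different prot elements never fall into the same group.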