How to organize (group) nodes under a closed element - XSLT


I have done simple grouping of XML with XSLT 1.0 before and it worked, but here I have a more complicated and rather different situation. The XML structure is basically this:

<Main>
  <TB>
    <!-- some elements and stuff - not relevant -->
    <City>
      <Area>
        <Position>5</Position>
        <House>
          <!-- some elements and stuff -->
        </House>
      </Area>
      <Area>
        <Position>5</Position>
        <Block>
          <!-- some elements and stuff -->
        </Block>
      </Area>
      <Area>
        <Position>6</Position>
        <House>
          <!-- some elements and stuff -->
        </House>
      </Area>
      <Area>
        <Position>6</Position>
        <Block>
          <!-- some elements and stuff -->
        </Block>
      </Area>
    </City>
    <City>
      <!-- same structure but with several repetitions of Position 7 and 8 -->
    </City>
  </TB>
</Main>

What I need is to group the Blocks and Houses that are under the same Position and remove the repeated Position numbers. For example, the result should look like this:

   <City>
     <Area>
       <Position>5</Position>
       <House>
         <!-- some elements and stuff -->
       </House>
       <Block>
         <!-- some elements and stuff -->
       </Block>
     </Area>
     <Area>
       <Position>6</Position>
       <House>
         <!-- some elements and stuff -->
       </House>
       <Block>
         <!-- some elements and stuff -->
       </Block>
     </Area>
   </City>
   <City>
     <!-- same structure for Position 7 and 8 -->
   </City>

It's harder because the Position is not an attribute of the Area, so I basically have to identify the value of the Position of the Area, then grab the House and Block that fall under the same Position, and put them together surrounded by the same <Area> </Area>.


Solution

  • This looks like a fairly standard Muenchian grouping problem to me, grouping Area elements (not House or Block elements directly) by their Position.

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:strip-space elements="*" />
      <xsl:output method="xml" indent="yes" />
    
      <xsl:key name="areaByPosition" match="Area" use="Position" />
    
      <xsl:template match="@*|node()">
        <xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
      </xsl:template>
    
      <!-- for the first Area in each Position -->
      <xsl:template match="Area[generate-id() =
                                generate-id(key('areaByPosition', Position)[1])]">
        <Area>
          <!-- copy in the Position element once only -->
          <xsl:apply-templates select="Position" />
          <!-- copy in all sub-elements except Position from all matching Areas -->
          <xsl:apply-templates select="
                key('areaByPosition', Position)/*[not(self::Position)]" />
        </Area>
      </xsl:template>
    
      <!-- ignore all other Area elements -->
      <xsl:template match="Area" />
    </xsl:stylesheet>
    

    This assumes there are no other elements named Area elsewhere in the document. If any of the "some elements and stuff" may be named Area, then you need to be a bit more specific, for example by limiting the grouping to Area elements that are direct children of a City:

    <xsl:key name="areaByPosition" match="City/Area" use="Position" />
    
    <xsl:template match="City/Area[generate-id() =
                                   generate-id(key('areaByPosition', Position)[1])]"
                  priority="2">
      ...
    </xsl:template>
    
    <xsl:template match="City/Area" priority="1" />
    

    (The priorities are explicit because otherwise both templates would have the same default priority of 0.5.)
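
    One thing the question does not settle is whether the same Position value can appear in more than one City. A key indexes the whole document, so in that case the approach above would merge those Areas into whichever City comes first. The following is only a sketch of one possible variant, assuming grouping should stay within each City: the key value combines generate-id() of the parent City with the Position. The key name areaByCityAndPosition and the '|' separator are illustrative choices, not something taken from the question.

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:strip-space elements="*" />
      <xsl:output method="xml" indent="yes" />

      <!-- assumption: group per City, so the key combines the parent City's id with the Position -->
      <xsl:key name="areaByCityAndPosition" match="City/Area"
               use="concat(generate-id(..), '|', Position)" />

      <!-- identity template: copy everything else unchanged -->
      <xsl:template match="@*|node()">
        <xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
      </xsl:template>

      <!-- the first Area for each (City, Position) pair -->
      <xsl:template match="City/Area[generate-id() =
            generate-id(key('areaByCityAndPosition',
                            concat(generate-id(..), '|', Position))[1])]"
                    priority="2">
        <Area>
          <!-- copy the Position element once only -->
          <xsl:apply-templates select="Position" />
          <!-- copy the non-Position children of every Area in the same City with the same Position -->
          <xsl:apply-templates select="
                key('areaByCityAndPosition',
                    concat(generate-id(..), '|', Position))/*[not(self::Position)]" />
        </Area>
      </xsl:template>

      <!-- ignore all other City/Area elements -->
      <xsl:template match="City/Area" priority="1" />
    </xsl:stylesheet>

    For the sample input in the question, where each Position value occurs in only one City, this should produce the same result as the simpler key on Position alone.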