XSLT split output files - Muenchian grouping


I have an XSLT stylesheet that transforms a large amount of data. I would like to add a "split" step, either as a chained XSLT or within the current stylesheet, that creates multiple output files so that each file stays under a certain size threshold. Assume the input XML is as follows:

<People>
<Person>             
<name>John</name>             
<date>June12</date>             
<workTime taskID="1">34</workTime>             
<workTime taskID="2">12</workTime>             
</Person>             
<Person>             
<name>John</name>             
<date>June13</date>             
<workTime taskID="1">21</workTime>             
<workTime taskID="2">11</workTime>             
</Person>
<Person>             
<name>Jack</name>             
<date>June19</date>             
<workTime taskID="1">20</workTime>             
<workTime taskID="2">30</workTime>             
</Person>    
</People>

The XSLT file, which uses Muenchian grouping, is as follows:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="PersonTasks" match="workTime" use="concat(@taskID, ../name)"/>
<xsl:template match="/">
    <People>
    <xsl:apply-templates select="//workTime[generate-id() = generate-id(key('PersonTasks',concat(@taskID, ../name))[1])]"/>
    </People>
</xsl:template>

<xsl:template match="workTime">
    <xsl:variable name="taskID">
        <xsl:value-of select="@taskID"/>
    </xsl:variable>
    <xsl:variable name="name">
        <xsl:value-of select="../name"/>
    </xsl:variable>
    <Person>
        <name>
            <xsl:value-of select="$name"/>
        </name>
        <taskID>
            <xsl:value-of select="$taskID"/>
        </taskID>
        <xsl:for-each select="//workTime[../name = $name][@taskID = $taskID]">
            <workTime>
                <date>
                    <xsl:value-of select="../date"/>
                </date>
                <time>
                    <xsl:value-of select="."/>
                </time>
            </workTime>
        </xsl:for-each>
    </Person>
</xsl:template>
</xsl:stylesheet>

However, instead of one large output file I would like several files, as shown below. In this example each file contains a single name, but the number of names per file should be a parameter.

Output file for first person:

<People>
    <Person>
        <name>John</name>
        <taskID>1</taskID>
        <workTime>
        <date>June12</date>
        <time>34</time>
        </workTime>
        <workTime>
        <date>June13</date>
        <time>21</time>
        </workTime>
    </Person>
    <Person>
        <name>John</name>
        <taskID>2</taskID>
        <workTime>
        <date>June12</date>
        <time>12</time>
        </workTime>
        <workTime>
        <date>June13</date>
        <time>11</time>
        </workTime>
    </Person>
</People>

Output file for second person:

<People>
    <Person>
        <name>Jack</name>
        <taskID>1</taskID>
        <workTime>
        <date>June19</date>
        <time>20</time>
        </workTime>
    </Person>
    <Person>
        <name>Jack</name>
        <taskID>2</taskID>
        <workTime>
        <date>June19</date>
        <time>30</time>
        </workTime>
    </Person>
</People>

What would be the preferred and most elegant approach using XSLT 1.0? Is there a way to call a chained XSLT within the XSLT so as to split the output files?

Cheers.


Solution

  • Is there a way to call a chained XSLT within the XSLT so as to split the output files?

    A few ways:

    1. You could write an extension function to do this -- check the documentation of your XSLT processor.

    2. Use the <exsl:document> extension element of EXSLT, if your XSLT processor supports it.

    3. Use the <saxon:output> extension element if you have Saxon 6.x.

    4. In a loop from your programming language invoke a separate transformation, passing to it as parameter the name of the person for which to produce results.
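
    For option 4, the stylesheet that the loop invokes could look roughly like the sketch below. It is the grouping from the question, restricted to one person. The parameter name personName is an assumption made for illustration, and the '|' separator in the key is an addition to keep the taskID and the name from running together:

    ```xml
    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="xml" indent="yes"/>

     <!-- Hypothetical parameter: set by the calling program on every run -->
     <xsl:param name="personName"/>

     <!-- Same grouping key as in the question, plus a '|' separator -->
     <xsl:key name="PersonTasks" match="workTime"
      use="concat(@taskID, '|', ../name)"/>

     <xsl:template match="/">
      <People>
       <!-- First workTime of each (taskID, name) group, for this person only -->
       <xsl:apply-templates select="/People/Person[name = $personName]/workTime
         [generate-id() =
          generate-id(key('PersonTasks', concat(@taskID, '|', ../name))[1])]"/>
      </People>
     </xsl:template>

     <xsl:template match="workTime">
      <Person>
       <name><xsl:value-of select="../name"/></name>
       <taskID><xsl:value-of select="@taskID"/></taskID>
       <!-- All workTime elements that share this taskID and name -->
       <xsl:for-each select="key('PersonTasks', concat(@taskID, '|', ../name))">
        <workTime>
         <date><xsl:value-of select="../date"/></date>
         <time><xsl:value-of select="."/></time>
        </workTime>
       </xsl:for-each>
      </Person>
     </xsl:template>
    </xsl:stylesheet>
    ```

    The calling program would then run this once per name and choose the output file itself. With xsltproc, for instance, one run might be xsltproc --stringparam personName John -o John.xml split.xsl people.xml (the file names here are made up).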

    Here are code examples for options 2 and 3 above:

    Using <saxon:output>:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:saxon="http://icl.com/saxon"
      extension-element-prefixes="saxon" >
    
     <xsl:template match="/">
      <xsl:for-each select="/*/*[not(. > 3)]">
       <saxon:output href="c:\xml\doc{.}">
        <xsl:copy-of select="."/>
       </saxon:output>
      </xsl:for-each>
     </xsl:template>
    </xsl:stylesheet>
    

    When this transformation is applied to the following XML document:

    <nums>
      <num>01</num>
      <num>02</num>
      <num>03</num>
      <num>04</num>
      <num>05</num>
      <num>06</num>
      <num>07</num>
      <num>08</num>
      <num>09</num>
      <num>10</num>
    </nums>
    

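    three files, c:\xml\doc01, c:\xml\doc02 and c:\xml\doc03, are created, each containing a copy of the corresponding <num> element. (The file names end in 01, 02 and 03 because {.} inserts the string value of each <num> element, leading zero included.)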

    The same example using the EXSLT <exsl:document> element (bound to the prefix ext here):

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:ext="http://exslt.org/common"
      extension-element-prefixes="ext" >
    
     <xsl:template match="/">
      <xsl:for-each select="/*/*[not(. > 3)]">
       <ext:document href="c:\xml\doc{.}">
        <xsl:copy-of select="."/>
       </ext:document>
      </xsl:for-each>
     </xsl:template>
    </xsl:stylesheet>
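
    Applied to the input from the question, the <exsl:document> approach might look like the following sketch. It assumes a processor that implements exsl:document (libxslt's xsltproc is one). The namesPerFile parameter, the PersonByName key, the people{N}.xml file names and the '|' separator in the key are all assumptions made for illustration, not part of the original stylesheet:

    ```xml
    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:exsl="http://exslt.org/common"
     extension-element-prefixes="exsl">

     <!-- Hypothetical parameter: number of distinct names per output file -->
     <xsl:param name="namesPerFile" select="1"/>

     <xsl:key name="PersonByName" match="Person" use="name"/>
     <xsl:key name="PersonTasks" match="workTime"
      use="concat(@taskID, '|', ../name)"/>

     <xsl:template match="/">
      <!-- One Person per distinct name (Muenchian grouping on name) -->
      <xsl:variable name="firsts" select="/People/Person
        [generate-id() = generate-id(key('PersonByName', name)[1])]"/>

      <!-- Visit every namesPerFile-th distinct name: one visit per file -->
      <xsl:for-each select="$firsts[(position() - 1) mod $namesPerFile = 0]">
       <xsl:variable name="chunk" select="position()"/>
       <exsl:document href="people{$chunk}.xml" method="xml" indent="yes">
        <People>
         <!-- The distinct names that belong to this chunk -->
         <xsl:apply-templates mode="group" select="$firsts
           [position() &gt; ($chunk - 1) * $namesPerFile
            and position() &lt;= $chunk * $namesPerFile]"/>
        </People>
       </exsl:document>
      </xsl:for-each>
     </xsl:template>

     <!-- For one name: first workTime of each (taskID, name) group -->
     <xsl:template match="Person" mode="group">
      <xsl:apply-templates select="key('PersonByName', name)/workTime
        [generate-id() =
         generate-id(key('PersonTasks', concat(@taskID, '|', ../name))[1])]"/>
     </xsl:template>

     <xsl:template match="workTime">
      <Person>
       <name><xsl:value-of select="../name"/></name>
       <taskID><xsl:value-of select="@taskID"/></taskID>
       <!-- All workTime elements that share this taskID and name -->
       <xsl:for-each select="key('PersonTasks', concat(@taskID, '|', ../name))">
        <workTime>
         <date><xsl:value-of select="../date"/></date>
         <time><xsl:value-of select="."/></time>
        </workTime>
       </xsl:for-each>
      </Person>
     </xsl:template>
    </xsl:stylesheet>
    ```

    With the sample input and the default of one name per file, this should write people1.xml for John and people2.xml for Jack. Splitting on a count of names only approximates a size threshold, since XSLT 1.0 has no direct way to measure how large the serialized output will be.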