for-each-group based on position


I have an XML document in this format:

<rows>
    <row>
        <gSubAssetType>
            <mSubAssetType>
                <SubAssetType>126</SubAssetType>
                <sgSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0000205282</SubAssScno>
                    </sSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0002000294</SubAssScno>
                    </sSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0000203622</SubAssScno>
                    </sSubAssCou>
                </sgSubAssCou>
                <sgTest>
                    <sTest>
                        <Campo1>Value1</Campo1>
                    </sTest>
                    <sTest>
                        <Campo1>Value2</Campo1>
                    </sTest>
                </sgTest>
            </mSubAssetType>
            <mSubAssetType>
                <SubAssetType>125</SubAssetType>
                <sgSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0000209645</SubAssScno>
                    </sSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0000204835</SubAssScno>
                    </sSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0000208014</SubAssScno>
                    </sSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0000208854</SubAssScno>
                    </sSubAssCou>
                </sgSubAssCou>
            </mSubAssetType>
            <mSubAssetType>
                <SubAssetType>1</SubAssetType>
                <sgSubAssCou>
                    <sSubAssCou>
                        <SubAssScno>0000208850</SubAssScno>
                    </sSubAssCou>
                </sgSubAssCou>
            </mSubAssetType>
        </gSubAssetType>
    </row>
</rows>

I want to extract the data in this format:

SubAssetType_SubAssCou;1;1;0000205282;
SubAssetType_SubAssCou;1;2;0002000294;
SubAssetType_SubAssCou;1;3;0000203622;
SubAssetType_SubAssCou;2;1;0000209645;
SubAssetType_SubAssCou;2;2;0000204835;
SubAssetType_SubAssCou;2;3;0000208014;
SubAssetType_SubAssCou;2;4;0000208854;
SubAssetType_SubAssCou;3;1;0000208850;
SubAssetType_Test;1;1;Value1;
SubAssetType_Test;1;2;Value2;

I want to extract the elements whose names start with s (<sSubAssCou> and <sTest>), keeping track of the position of each element inside its parent and of the position of the enclosing <mSubAssetType> among its siblings. I could do this with nested for-each instructions, but I also need <xsl:result-document> to write the output to a file, and Saxon gives an error when I try to write to the same document twice. So I am trying to use for-each-group instead, but I don't see how to keep the position of the mSubAssetType element for each sSubAssCou. Any ideas how I can do that?

This is the template I am using:

<xsl:template match="rows">
    <xsl:for-each select="row/*[starts-with(name(),'g')]">
        <xsl:variable name="tablePrefix" select="substring(name(),2)"/>
        <xsl:for-each select="*[starts-with(name(),'m')]">
            <xsl:variable name="mvIndex" select="position()"/>
            <xsl:for-each select="*[starts-with(name(),'sg')]">
                <xsl:variable name="tableSubPrefix" select="substring(name(),3)"/>
                <xsl:for-each select="*[starts-with(name(),'s')]">
                    <xsl:variable name="svIndex" select="position()"/>
                    <xsl:value-of select="$tablePrefix"/>_<xsl:value-of select="$tableSubPrefix"/>;<xsl:value-of select="$mvIndex"/>;<xsl:value-of select="$svIndex"/>;<xsl:value-of select="./*"/>;
                </xsl:for-each>
            </xsl:for-each>
        </xsl:for-each>
    </xsl:for-each>
</xsl:template>

This works fine, but if I change it to write to a file named after the table prefix and sub-prefix using result-document, as follows:

<xsl:template match="rows">
    <xsl:for-each select="row/*[starts-with(name(),'g')]">
        <xsl:variable name="tablePrefix" select="substring(name(),2)"/>
        <xsl:for-each select="*[starts-with(name(),'m')]">
            <xsl:variable name="mvIndex" select="position()"/>
            <xsl:for-each select="*[starts-with(name(),'sg')]">
                <xsl:variable name="tableSubPrefix" select="substring(name(),3)"/>
                <xsl:result-document href="t_{$tablePrefix}_{$tableSubPrefix}.txt" method="text">
                <xsl:for-each select="*[starts-with(name(),'s')]">
                    <xsl:variable name="svIndex" select="position()"/>
                    <xsl:value-of select="$tablePrefix"/>_<xsl:value-of select="$tableSubPrefix"/>;<xsl:value-of select="$mvIndex"/>;<xsl:value-of select="$svIndex"/>;<xsl:value-of select="./*"/>;
                </xsl:for-each>
                </xsl:result-document>
            </xsl:for-each>
        </xsl:for-each>
    </xsl:for-each>
</xsl:template>

Then it fails with this error:

Error at xsl:result-document on line 14 of generic-extract-test.xsl:

XTDE1490: Cannot write more than one result document to the same URI:
file:/xxxx/t_SubAssetType_SubAssCou.txt
in built-in template rule
Cannot write more than one result document to the same URI: file:/xxxx/t_SubAssetType_SubAssCou.txt

It seems you cannot write to the same result document twice. So the question is: how can I group these nodes so that I can process them grouped by node name while keeping the mvIndex and svIndex?


Solution

  • Using xsl:result-document and xsl:number:

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="3.0"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="#all"
      expand-text="yes">
    
      <xsl:output method="text"/>
    
      <xsl:template match="gSubAssetType">
        <xsl:result-document href="SubAssScno.txt">
          <xsl:apply-templates select=".//SubAssScno"/>
        </xsl:result-document>
        <xsl:result-document href="sTest.txt">
          <xsl:apply-templates select=".//sTest/Campo1"/>
        </xsl:result-document>
      </xsl:template>
      
      <xsl:template match="SubAssScno">
        <xsl:text>{local-name(../../..)}_{local-name(../..)},</xsl:text>
        <xsl:number count="mSubAssetType | sSubAssCou" level="multiple" format="1,1"/>
        <xsl:text>,{.}&#10;</xsl:text>
      </xsl:template>
    
      <xsl:template match="Campo1">
        <xsl:text>{local-name(../../..)}_{local-name(../..)},</xsl:text>
        <xsl:number count="mSubAssetType | sTest" level="multiple" format="1,1"/>
        <xsl:text>,{.}&#10;</xsl:text>
      </xsl:template> 
    
    </xsl:stylesheet>
    

    Online fiddle running Saxon HE 12.5 Java with CheerpJ 3 in the browser.
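
    The xsl:number instruction with level="multiple" does the work of the mvIndex and svIndex variables: it emits one number for each ancestor-or-self element matching the count pattern, outermost first, each counted among its matching siblings. As a minimal sketch, assuming the element names of the sample input and the semicolon-separated layout from the question (the prefix stripping with substring() is my assumption about the wanted label), the SubAssScno template could be adapted like this, here writing to the primary output only:

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="3.0"
      expand-text="yes">

      <xsl:output method="text"/>

      <!-- Process only the leaf values so that no other text reaches the output. -->
      <xsl:template match="/">
        <xsl:apply-templates select="//sSubAssCou/SubAssScno"/>
      </xsl:template>

      <xsl:template match="SubAssScno">
        <!-- gSubAssetType becomes "SubAssetType", sgSubAssCou becomes "SubAssCou". -->
        <xsl:text>{substring(local-name(../../../..), 2)}_{substring(local-name(../..), 3)};</xsl:text>
        <!-- First number: position of mSubAssetType; second: position of sSubAssCou. -->
        <xsl:number count="mSubAssetType | sSubAssCou" level="multiple" format="1;1"/>
        <xsl:text>;{.};&#10;</xsl:text>
      </xsl:template>

    </xsl:stylesheet>
    ```

    For the sample input the first line of output should be SubAssetType_SubAssCou;1;1;0000205282; and the last one SubAssetType_SubAssCou;3;1;0000208850;.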

    After you edited your question with a changed input sample and some XSLT code, I think that, if the element names are not known in advance and only the prefixes of the element names matter, you want code like this:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="3.0"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="#all"
      expand-text="yes">
    
      <xsl:output method="text"/>
    
      <xsl:template match="rows">
          <xsl:for-each-group select="row/*[starts-with(name(),'g')]/*[starts-with(name(),'m')]/*[starts-with(name(),'sg')]" composite="yes" group-by="substring(name(../..), 2), substring(name(), 3)">
            <xsl:result-document href="t_{current-grouping-key()[1]}_{current-grouping-key()[2]}.txt">
              <xsl:variable name="outer-keys" select="current-grouping-key()"/>
              <xsl:for-each-group select="current-group()/*[starts-with(name(),'s')]" group-by="node-name()">
                <xsl:for-each select="current-group()">
                  <xsl:variable name="pos" as="xs:string">
                    <xsl:number level="multiple" format="1;1" count="*[starts-with(name(),'m')] | *[matches(name(),'^s[^g]')]"/>
                  </xsl:variable>
                  <xsl:value-of select="string-join($outer-keys, '_'), $pos, *" separator=";"/>
                  <xsl:text>&#10;</xsl:text>
                </xsl:for-each>
              </xsl:for-each-group>
            </xsl:result-document>
          </xsl:for-each-group>
      </xsl:template>
    
    </xsl:stylesheet>
    

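
    The composite grouping key is what avoids XTDE1490: every distinct pair of table prefix and sub-prefix is visited exactly once, so each href is opened only once. As a rough sketch for checking the grouping on its own (it writes no files and only reports which file names would be used; the report wording is made up for illustration), something like this could be run against the sample input:

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="3.0"
      expand-text="yes">

      <xsl:output method="text"/>

      <xsl:template match="rows">
        <!-- Same selection and composite key as in the stylesheet above. -->
        <xsl:for-each-group
            select="row/*[starts-with(name(),'g')]/*[starts-with(name(),'m')]/*[starts-with(name(),'sg')]"
            composite="yes"
            group-by="substring(name(../..), 2), substring(name(), 3)">
          <!-- One line per group, i.e. per result document that would be written. -->
          <xsl:text>t_{current-grouping-key()[1]}_{current-grouping-key()[2]}.txt: </xsl:text>
          <xsl:text>{count(current-group())} sg element(s), {count(current-group()/*)} record(s)&#10;</xsl:text>
        </xsl:for-each-group>
      </xsl:template>

    </xsl:stylesheet>
    ```

    With the sample input this should report two groups, t_SubAssetType_SubAssCou.txt with 3 sg elements and 8 records and t_SubAssetType_Test.txt with 1 sg element and 2 records, which matches the two files the full stylesheet creates.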