xslt copy xslt-1.0 tokenize apply-templates

Duplicate a node for every token in a grandchild element and replace the grandchild's text with the token in XSLT 1.0?


My (simplified) input XML looks like this:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <recordList>

    <record>
      <id>16</id>

      <MaterialGroup>
        <material>
          <term>metal, glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>wood</term>
        </material>
        <material.notes>fragile</material.notes>
        <material.part>lid</material.part>
      </MaterialGroup>
    </record>

    <record>
      ...
    </record>

  </recordList>
</root>

Note that term may contain a comma-separated list of multiple materials (metal, glass).

Desired output:

I want to split material/term on the separator and duplicate its grandparent MaterialGroup, with all attributes and child nodes, once for each token.

<?xml version="1.0" encoding="utf-8"?>
...
      <MaterialGroup>
        <material>
          <term>metal</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>wood</term>
        </material>
        <material.notes>fragile</material.notes>
        <material.part>lid</material.part>
      </MaterialGroup>
    </record>
...

The first MaterialGroup is copied once for every token in the delimited grandchild element material/term, and the term text is set to the token text. material.part and material.notes can be copied unchanged.

My stylesheet:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
  <xsl:output method="xml" indent="yes"/>
  <xsl:variable name="separator" select="','"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>


  <xsl:template match="material/term" mode="s">
    <xsl:param name="split_term"/>
    <xsl:value-of select="$split_term"/>
  </xsl:template>


  <xsl:template match="MaterialGroup" name="tokenize">
    <xsl:param name="text" select="material/term"/>

    <xsl:choose>
      <xsl:when test="not(contains($text, $separator))">
        <xsl:copy>
          <xsl:apply-templates/>
          <xsl:apply-templates select="material/term" mode="s">
            <xsl:with-param name="split_term">
              <xsl:value-of select="normalize-space($text)"/>
            </xsl:with-param>
          </xsl:apply-templates>

        </xsl:copy>

      </xsl:when>

      <xsl:otherwise>
        <xsl:copy>

          <xsl:apply-templates/>
          <xsl:apply-templates select="material/term" mode="s">
            <xsl:with-param name="split_term">
              <xsl:value-of select="normalize-space(substring-before($text, $separator))"/>
            </xsl:with-param>
          </xsl:apply-templates>

        </xsl:copy>

        <xsl:call-template name="tokenize">
          <xsl:with-param name="text" select="substring-after($text, $separator)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template>

</xsl:stylesheet>

Actual output:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <recordList>

    <record>
      <id>16</id>

      <MaterialGroup>
        <material>
          <term>metal, glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
        metal
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>metal, glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
        glass
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>wood</term>
        </material>
        <material.notes>fragile</material.notes>
        <material.part>lid</material.part>
        wood
      </MaterialGroup>
    </record>

    <record>
      ...
    </record>
  </recordList>
</root>

The tokens (metal, glass) appear as text nodes that are direct children of MaterialGroup, after material.part. The element where each token should actually appear (material/term) is unchanged.

I looked at a couple of solutions to similar problems, but without success:

https://stackoverflow.com/a/5480198/2044940
https://stackoverflow.com/a/10430719/2044940
http://codesequoia.wordpress.com/2012/02/15/xslt-example-add-a-new-node-to-elements/
...

Any ideas?


Edit: Solution by Martin, without modes as suggested by michael:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="separator" select="', '"/>

  <xsl:template match="@* | node()">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:apply-templates select="@* | node()">
        <xsl:with-param name="term" select="$term"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template> 


  <xsl:template match="material/term">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:value-of select="$term"/>
    </xsl:copy>
  </xsl:template>


  <xsl:template match="MaterialGroup" name="tokenize">
    <xsl:param name="text" select="material/term"/>

    <xsl:choose>
      <xsl:when test="not(contains($text, $separator))">
        <xsl:copy>
          <xsl:apply-templates>
            <xsl:with-param name="term" select="$text"/>
          </xsl:apply-templates>
        </xsl:copy> 
      </xsl:when>

      <xsl:otherwise> 
        <xsl:copy>
          <xsl:apply-templates>
            <xsl:with-param name="term" select="substring-before($text, $separator)"/>
          </xsl:apply-templates>
        </xsl:copy> 

        <xsl:call-template name="tokenize">
          <xsl:with-param name="text" select="substring-after($text, $separator)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template>

</xsl:stylesheet>

Solution

  • I think you need to pass your term around:

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
    >
      <xsl:output method="xml" indent="yes"/>
      <xsl:strip-space elements="*"/>
    
      <xsl:param name="separator" select="', '"/>
    
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
      </xsl:template>
    
      <xsl:template match="@* | node()" mode="s">
        <xsl:param name="term"/>
        <xsl:copy>
          <xsl:apply-templates select="@* | node()" mode="s">
            <xsl:with-param name="term" select="$term"/>
          </xsl:apply-templates>
        </xsl:copy>
      </xsl:template>
    
      <xsl:template match="material/term" mode="s">
        <xsl:param name="term"/>
        <xsl:copy>
          <xsl:value-of select="$term"/>
        </xsl:copy>
      </xsl:template>
    
    
      <xsl:template match="MaterialGroup" name="tokenize">
        <xsl:param name="text" select="material/term"/>
    
        <xsl:choose>
          <xsl:when test="not(contains($text, $separator))">
            <xsl:copy>
              <xsl:apply-templates mode="s">
                <xsl:with-param name="term" select="$text"/>
              </xsl:apply-templates>
            </xsl:copy>
    
          </xsl:when>
    
          <xsl:otherwise>
    
            <xsl:copy>
              <xsl:apply-templates mode="s">
                <xsl:with-param name="term" select="substring-before($text, $separator)"/>
              </xsl:apply-templates>
            </xsl:copy>
    
    
            <xsl:call-template name="tokenize">
              <xsl:with-param name="text" select="substring-after($text, $separator)"/>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
    
      </xsl:template>
    
    </xsl:stylesheet>
    

    That way, with your input, I get

    <root>
       <recordList>
          <record>
             <id>16</id>
             <MaterialGroup>
                <material>
                   <term>metal</term>
                </material>
                <material.notes/>
                <material.part>body</material.part>
             </MaterialGroup>
             <MaterialGroup>
                <material>
                   <term>glass</term>
                </material>
                <material.notes/>
                <material.part>body</material.part>
             </MaterialGroup>
             <MaterialGroup>
                <material>
                   <term>wood</term>
                </material>
                <material.notes>fragile</material.notes>
                <material.part>lid</material.part>
             </MaterialGroup>
          </record>
          <record>
          ...
        </record>
       </recordList>
    </root>
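
    A possible variation, as a sketch only: the stylesheet above splits on the exact string ', ', so a term such as metal,glass (no space after the comma) would not be split. Assuming you want to tolerate that, you could split on the bare comma and trim each token with normalize-space(). The sketch also selects @* | node() inside the MaterialGroup copy, on the assumption that MaterialGroup may carry attributes you want to keep (a bare xsl:apply-templates only processes child nodes). It uses the single-mode form from the question's edit.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumption: split on the bare comma; whitespace is trimmed per token -->
  <xsl:param name="separator" select="','"/>

  <!-- Identity template that also forwards the current token downwards -->
  <xsl:template match="@* | node()">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:apply-templates select="@* | node()">
        <xsl:with-param name="term" select="$term"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the text of material/term with the trimmed token -->
  <xsl:template match="material/term">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:value-of select="normalize-space($term)"/>
    </xsl:copy>
  </xsl:template>

  <!-- Emit one MaterialGroup copy per token, then recurse on the remainder.
       xsl:call-template keeps the current node, so xsl:copy still copies
       the same MaterialGroup on every recursion step. -->
  <xsl:template match="MaterialGroup" name="tokenize">
    <xsl:param name="text" select="material/term"/>
    <xsl:choose>
      <xsl:when test="contains($text, $separator)">
        <xsl:copy>
          <!-- @* is included so attributes on MaterialGroup are kept -->
          <xsl:apply-templates select="@* | node()">
            <xsl:with-param name="term" select="substring-before($text, $separator)"/>
          </xsl:apply-templates>
        </xsl:copy>
        <xsl:call-template name="tokenize">
          <xsl:with-param name="text" select="substring-after($text, $separator)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- Last (or only) token -->
        <xsl:copy>
          <xsl:apply-templates select="@* | node()">
            <xsl:with-param name="term" select="$text"/>
          </xsl:apply-templates>
        </xsl:copy>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

    With the sample input this should produce the same three MaterialGroup elements as above. Two caveats: normalize-space() also collapses runs of whitespace inside a token, and a trailing comma in term would yield an extra MaterialGroup with an empty term, so you may want an additional test if your data can contain that.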