XSLT multiple string replacement with recursion


I have been trying to perform multiple (different) string replacements with recursion and have hit a roadblock. I have successfully gotten the first replacement to work, but the subsequent replacements never fire. I know this has to do with the recursion and how the with-param string is passed back into the call-template. I see my error and why the next xsl:when never fires, but I just can't figure out how to pass the complete modified string from the first xsl:when to the second xsl:when. Any help is greatly appreciated.

<xsl:template name="replace">
    <xsl:param name="string" select="." />
    <xsl:choose>
        <xsl:when test="contains($string, '&#13;&#10;')">
            <xsl:value-of select="substring-before($string, '&#13;&#10;')" />
            <br/>
            <xsl:call-template name="replace">
                <xsl:with-param name="string" select="substring-after($string, '&#13;&#10;')"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:when test="contains($string, 'TXT')">
            <xsl:value-of select="substring-before($string, '&#13;TXT')" />
            <xsl:call-template name="replace">
                <xsl:with-param name="string" select="substring-after($string, '&#13;')" />
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$string"/>
        </xsl:otherwise>

    </xsl:choose>
</xsl:template>

Solution

  • This transformation is fully parameterized and doesn't need any tricks with default namespaces:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:my="my:my">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>
    
     <my:params xml:space="preserve">
      <pattern>
       <old>&#xA;</old>
       <new><br/></new>
      </pattern>
      <pattern>
       <old>quick</old>
       <new>slow</new>
      </pattern>
      <pattern>
       <old>fox</old>
       <new>elephant</new>
      </pattern>
      <pattern>
       <old>brown</old>
       <new>white</new>
      </pattern>
     </my:params>
    
     <xsl:variable name="vPats"
          select="document('')/*/my:params/*"/>
    
     <xsl:template match="text()" name="multiReplace">
      <xsl:param name="pText" select="."/>
      <xsl:param name="pPatterns" select="$vPats"/>
    
      <xsl:if test=
       "string-length($pText) >0">
    
        <!-- First pattern (in document order) that the remaining text starts with -->
        <xsl:variable name="vPat" select=
         "$pPatterns[starts-with($pText, old)][1]"/>
        <xsl:choose>
         <xsl:when test="not($vPat)">
           <xsl:copy-of select="substring($pText,1,1)"/>
         </xsl:when>
         <xsl:otherwise>
           <xsl:copy-of select="$vPat/new/node()"/>
         </xsl:otherwise>
        </xsl:choose>
    
        <!-- Skip one character (no match) or the whole matched old string -->
        <xsl:call-template name="multiReplace">
          <xsl:with-param name="pText" select=
           "substring($pText, 1 + not($vPat) + string-length($vPat/old/node()))"/>
          <xsl:with-param name="pPatterns" select="$pPatterns"/>
        </xsl:call-template>
      </xsl:if>
     </xsl:template>
    </xsl:stylesheet>
    

    when it is applied on this XML document:

    <t>The quick
    brown fox</t>
    

    the wanted, correct result is produced:

    The slow<br/>white elephant
    

    Explanation:

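    The text is scanned from left to right. At each position, if the remaining string starts with one of the specified patterns, that starting substring is replaced by the replacement specified for the first matching pattern; otherwise the current character is copied unchanged. In the recursive call, not($vPat) converts to 1 when nothing matched and to 0 otherwise, so the scan resumes either one character further on or just after the matched old string.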

    Do note: If we have search patterns:

       "relation"   --> "mapping" 
       "corelation" --> "similarity"
    

    in the above order, and text:

       "corelation"
    

    then this solution produces the more correct result:

    "similarity"
    

    whereas the currently accepted solution (by @Alejandro) produces:

    "comapping"
    
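    The claim can be checked with a cut-down stylesheet. What follows is only a sketch: a trimmed copy of the multiReplace template above with the two patterns from this note, in the same order, and text output for brevity. Like the original, it assumes that document('') can resolve the stylesheet's own URI, so the stylesheet has to be loaded from a file or URL rather than from an in-memory string.

    ```xml
    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:my="my:my">
     <xsl:output method="text"/>

     <!-- Same order as in the note: "relation" before "corelation" -->
     <my:params>
      <pattern><old>relation</old><new>mapping</new></pattern>
      <pattern><old>corelation</old><new>similarity</new></pattern>
     </my:params>

     <xsl:variable name="vPats" select="document('')/*/my:params/*"/>

     <xsl:template match="text()" name="multiReplace">
      <xsl:param name="pText" select="."/>
      <xsl:if test="string-length($pText) > 0">
       <!-- First pattern (in document order) that the remaining text starts with -->
       <xsl:variable name="vPat" select="$vPats[starts-with($pText, old)][1]"/>
       <xsl:choose>
        <xsl:when test="not($vPat)">
         <!-- Nothing matches here: copy a single character -->
         <xsl:value-of select="substring($pText, 1, 1)"/>
        </xsl:when>
        <xsl:otherwise>
         <xsl:copy-of select="$vPat/new/node()"/>
        </xsl:otherwise>
       </xsl:choose>
       <!-- Skip one character (no match) or the whole matched old string -->
       <xsl:call-template name="multiReplace">
        <xsl:with-param name="pText"
         select="substring($pText, 1 + not($vPat) + string-length($vPat/old))"/>
       </xsl:call-template>
      </xsl:if>
     </xsl:template>
    </xsl:stylesheet>
    ```

    Applied to <t>corelation</t>, this should output similarity: at the first position the remaining text does not start with "relation", so only the "corelation" pattern matches, and the inner "relation" is never examined on its own.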

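    Edit: With a small update we get another improvement: if more than one replacement is possible at a given location, we perform the longest one. The patterns are sorted by descending length of old, which in XSLT 1.0 requires the EXSLT node-set() extension function.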

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:ext="http://exslt.org/common"
     xmlns:my="my:my">
        <xsl:output omit-xml-declaration="yes"/>
        <xsl:strip-space elements="*"/>
    
        <my:params xml:space="preserve">
            <pattern>
                <old>&#xA;</old>
                <new><br/></new>
            </pattern>
            <pattern>
                <old>quick</old>
                <new>slow</new>
            </pattern>
            <pattern>
                <old>fox</old>
                <new>elephant</new>
            </pattern>
            <pattern>
                <old>brown</old>
                <new>white</new>
            </pattern>
        </my:params>
    
        <xsl:variable name="vrtfPats">
         <xsl:for-each select="document('')/*/my:params/*">
          <xsl:sort select="string-length(old)"
               data-type="number" order="descending"/>
           <xsl:copy-of select="."/>
         </xsl:for-each>
        </xsl:variable>
    
        <xsl:variable name="vPats" select=
         "ext:node-set($vrtfPats)/*"/>
    
        <xsl:template match="text()" name="multiReplace">
            <xsl:param name="pText" select="."/>
            <xsl:param name="pPatterns" select="$vPats"/>
            <xsl:if test="string-length($pText) > 0">
                <!-- Patterns are sorted longest first, so [1] is the longest match -->
                <xsl:variable name="vPat" select=
                "$pPatterns[starts-with($pText, old)][1]"/>
    
                <xsl:choose>
                    <xsl:when test="not($vPat)">
                        <xsl:copy-of select="substring($pText,1,1)"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:copy-of select="$vPat/new/node()"/>
                    </xsl:otherwise>
                </xsl:choose>
    
                <xsl:call-template name="multiReplace">
                    <xsl:with-param name="pText" select=
                    "substring($pText,
                              1 + not($vPat) + string-length($vPat/old/node())
                              )"/>
                    <xsl:with-param name="pPatterns" select="$pPatterns"/>
                </xsl:call-template>
            </xsl:if>
        </xsl:template>
    </xsl:stylesheet>
    

    Thus, if we have two replacements such as "core" --> "kernel" and "corelation" --> "similarity", the second is used for a text containing the word "corelation", regardless of the order in which the replacements are listed.
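
    As a quick check of this order-independence, here is a minimal sketch of the updated stylesheet reduced to these two replacements, deliberately listed shortest first. It assumes a processor that supports the EXSLT node-set() function (as the stylesheet above already does) and that document('') can resolve the stylesheet's own URI; text output is used for brevity.

    ```xml
    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:ext="http://exslt.org/common"
     xmlns:my="my:my">
     <xsl:output method="text"/>

     <!-- Shorter pattern listed first on purpose -->
     <my:params>
      <pattern><old>core</old><new>kernel</new></pattern>
      <pattern><old>corelation</old><new>similarity</new></pattern>
     </my:params>

     <!-- Copy of the patterns, sorted by descending length of old -->
     <xsl:variable name="vrtfPats">
      <xsl:for-each select="document('')/*/my:params/*">
       <xsl:sort select="string-length(old)"
            data-type="number" order="descending"/>
       <xsl:copy-of select="."/>
      </xsl:for-each>
     </xsl:variable>

     <xsl:variable name="vPats" select="ext:node-set($vrtfPats)/*"/>

     <xsl:template match="text()" name="multiReplace">
      <xsl:param name="pText" select="."/>
      <xsl:if test="string-length($pText) > 0">
       <!-- Longest pattern that the remaining text starts with -->
       <xsl:variable name="vPat" select="$vPats[starts-with($pText, old)][1]"/>
       <xsl:choose>
        <xsl:when test="not($vPat)">
         <xsl:value-of select="substring($pText, 1, 1)"/>
        </xsl:when>
        <xsl:otherwise>
         <xsl:copy-of select="$vPat/new/node()"/>
        </xsl:otherwise>
       </xsl:choose>
       <!-- Skip one character (no match) or the whole matched old string -->
       <xsl:call-template name="multiReplace">
        <xsl:with-param name="pText"
         select="substring($pText, 1 + not($vPat) + string-length($vPat/old))"/>
       </xsl:call-template>
      </xsl:if>
     </xsl:template>
    </xsl:stylesheet>
    ```

    Applied to <t>corelation core</t>, this should output similarity kernel: the longer pattern wins at the first position even though it is listed second, while the standalone "core" is still replaced by "kernel".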