Tags: regex, xslt-2.0, xslt-3.0, xslkey

regex-group(last()) in XSLT


I am working on linking some text in the input with analyze-string, but I am not able to retrieve regex-group(last()) in the script.

You can check the transform at https://xsltfiddle.liberty-development.net/bnnZVG

As you can see, the current output is

<?xml version="1.0" encoding="UTF-8"?>
<TEST>
    <P>Check <Link ID="ID0001">AbC,2013</Link>AbC,2013Marking</P>
    <P>Check <Link ID="ID0001">ABc, 2013</Link>ABc, 2013Marking</P>
    <P>Check <Link ID="ID0001">ABC 2013</Link>ABC 2013Marking</P>
    <P>Check <Link ID="ID0001">ABC</Link>ABCMarking</P>
    <P>Check <Link ID="ID0002">BCA,2013</Link>BCA,2013Marking</P>
    <P>Check <Link ID="ID0002">bcA, 2013</Link>bcA, 2013Marking</P>
    <P>Check <Link ID="ID0002">BCa 2013</Link>BCa 2013Marking</P>
    <P>Check <Link ID="ID0002">bcA</Link>bcAMarking</P>
</TEST>

but the expected output is

<?xml version="1.0" encoding="UTF-8"?>
<TEST>
    <P>Check <Link ID="ID0001">AbC,2013</Link> Marking</P>
    <P>Check <Link ID="ID0001">ABc, 2013</Link> Marking</P>
    <P>Check <Link ID="ID0001">ABC 2013</Link> Marking</P>
    <P>Check <Link ID="ID0001">ABC</Link> Marking</P>
    <P>Check <Link ID="ID0002">BCA,2013</Link> Marking</P>
    <P>Check <Link ID="ID0002">bcA, 2013</Link> Marking</P>
    <P>Check <Link ID="ID0002">BCa 2013</Link> Marking</P>
    <P>Check <Link ID="ID0002">bcA</Link> Marking</P>
</TEST>

Thanks in advance.


Solution

  • Which value do you expect from calling last() inside of xsl:analyze-string? If you look at the last paragraph in https://www.w3.org/TR/xslt-30/#element-analyze-string it says:

    A matching substring is processed using the xsl:matching-substring element, a non-matching substring using the xsl:non-matching-substring element. Each of these elements takes a sequence constructor as its content. If the element is absent, the effect is the same as if it were present with empty content. In processing each substring, the contents of the substring will be the context item (as a value of type xs:string); the position of the substring within the sequence of matching and non-matching substrings will be the context position; and the number of matching and non-matching substrings will be the context size.

    So, since last() returns the context size, inside xsl:analyze-string it equals the total number of matching and non-matching substrings, not the number of the last capturing group in the regex.
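
    To see this in isolation, here is a minimal sketch. The sample string and the regex are my own illustrative assumptions, not taken from your stylesheet. The string splits into three substrings (a non-match, a match, a non-match), so last() should be 3 in every branch, and regex-group(last()) then asks for a group 3 that the regex does not define, which yields a zero-length string:

    ```xslt
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

        <xsl:output method="text"/>

        <!-- Sample string and regex are assumptions chosen for illustration only -->
        <xsl:template match="/">
            <xsl:analyze-string select="'Check ABC 2013 Marking'"
                regex="(ABC)( [0-9][0-9][0-9][0-9])?">
                <xsl:matching-substring>
                    <!-- position() and last() count substrings, not capturing groups -->
                    <xsl:value-of select="'match', position(), 'of', last()"/>
                    <xsl:text>&#10;</xsl:text>
                    <!-- last() is the context size, so this requests an undefined group -->
                    <xsl:value-of select="'regex-group(last()) is empty:', regex-group(last()) = ''"/>
                    <xsl:text>&#10;</xsl:text>
                </xsl:matching-substring>
                <xsl:non-matching-substring>
                    <xsl:value-of select="'non-match', position(), 'of', last()"/>
                    <xsl:text>&#10;</xsl:text>
                </xsl:non-matching-substring>
            </xsl:analyze-string>
        </xsl:template>

    </xsl:stylesheet>
    ```

    The template matches the document node, so it can be run against any input document. If you need a particular group, refer to it by its fixed number in the regex (for example regex-group(2)) rather than by last().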

    I realize this is not quite an answer but it is too long to be used as a comment. You might also want to edit your question and tell us what the XSLT you have linked to is supposed to achieve in plain words, then we might be able to help suggest an appropriate XSLT solution.

    Also note that XSLT 3 with XPath 3 has an analyze-string function, which returns an XML structure with the matches and groups, so processing that structure might help you extract the contents you want:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:fn="http://www.w3.org/2005/xpath-functions"
        exclude-result-prefixes="#all"
        version="3.0">
    
        <xsl:mode on-no-match="shallow-copy"/>
    
        <xsl:key name="testkey" match="link" use="linkname"/>
    
        <xsl:variable name="testcheck">
            <link name="ID0001">
                <linkname>abc, 2013</linkname>
                <linkname>abc</linkname>
            </link>
            <link name="ID0002">
                <linkname>bca, 2013</linkname>
                <linkname>bca</linkname>
            </link>
        </xsl:variable>
    
        <xsl:variable name="test">
            <xsl:text>(</xsl:text>
            <xsl:for-each select="$testcheck//linkname">
                <xsl:if test="position() ne 1"><xsl:text>|</xsl:text></xsl:if>
                <xsl:value-of select="."/>
            </xsl:for-each>
            <xsl:text>)</xsl:text>
        </xsl:variable>
    
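        <!-- Wrap the alternation in boundary groups; a trailing ", YYYY" in a link name
             becomes an optional group that accepts ", YYYY", ",YYYY" or " YYYY" -->
        <xsl:variable name="regex" as="xs:string"
          select="concat('(^|\W)', replace($test, ', ([0-9][0-9][0-9][0-9])', '(, $1|,$1| $1)?'), '($|\W)')"/>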
    
        <xsl:template match="text()[matches(., $regex, 'i')]">
             <xsl:apply-templates select="analyze-string(., $regex, 'i')" mode="extract"/>
        </xsl:template>
    
        <xsl:mode name="extract" on-no-match="text-only-copy"/>
    
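        <!-- Group 2 holds the link text; normalise it to lower case with ", YYYY"
             so that it matches one of the linkname key values -->
        <xsl:template match="fn:match/fn:group[@nr = 2]" mode="extract">
            <Link ID="{$testcheck//key('testkey', lower-case(replace(current(), '(, |,| )([0-9][0-9][0-9][0-9])', ', $2')))/@name}">
                <xsl:value-of select="."/>
            </Link>
        </xsl:template>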
    
    </xsl:stylesheet>
    

    https://xsltfiddle.liberty-development.net/bnnZVG/2 gives

    <TEST>
        <P>Check <Link ID="ID0001">AbC,2013</Link> Marking</P>
        <P>Check <Link ID="ID0001">ABc, 2013</Link> Marking</P>
        <P>Check <Link ID="ID0001">ABC 2013</Link> Marking</P>
        <P>Check <Link ID="ID0001">ABC</Link> Marking</P>
        <P>Check <Link ID="ID0002">BCA,2013</Link> Marking</P>
        <P>Check <Link ID="ID0002">bcA, 2013</Link> Marking</P>
        <P>Check <Link ID="ID0002">BCa 2013</Link> Marking</P>
        <P>Check <Link ID="ID0002">bcA</Link> Marking</P>
    </TEST>
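
    If you want to look at the structure that the templates in the extract mode work on, a small sketch like the following serializes the result of the analyze-string function. The sample string and the simplified regex are assumptions for illustration; the regex built in the stylesheet above has more alternatives and therefore more groups.

    ```xslt
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

        <xsl:output method="xml" indent="no"/>

        <!-- Emits the fn:analyze-string-result element for a hard-coded sample string -->
        <xsl:template match="/">
            <xsl:sequence
                select="analyze-string('Check ABC 2013 Marking', '(^|\W)(abc( 2013)?)($|\W)', 'i')"/>
        </xsl:template>

    </xsl:stylesheet>
    ```

    The result is an analyze-string-result element in the http://www.w3.org/2005/xpath-functions namespace containing non-match and match children. Each match contains group elements with an nr attribute, and nested capturing groups appear as nested group elements, which is why selecting fn:group[@nr = 2] and taking its string value returns the whole link text including the optional year.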