Using grouping to pull together text and then test it


In this grotty extruded typesetting product, I sometimes see links and email addresses that have been split across adjacent Link elements. Example:

<p>Here is some random text with an email address 
<Link>example</Link><Link>@example.com</Link> and here 
is more random text with a url 
<Link>http://www.</Link><Link>example.com</Link> near the end of the sentence.</p>

Desired output:

<p>Here is some random text with an email address 
<email>example@example.com</email> and here is more random text 
with a url <ext-link ext-link-type="uri" xlink:href="http://www.example.com/">
http://www.example.com/</ext-link> near the end of the sentence.</p>

Whitespace never seems to occur between the split elements, which is one blessing.

I can tell I need to use an xsl:for-each-group within the p template, but I can't quite see how to put the combined text from the group through the contains() function so as to distinguish emails from URLs. Help?


Solution

  • If you use group-adjacent, you can simply string-join the current-group() and test the resulting string, as in

    <xsl:stylesheet
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="xsd"
      version="2.0">
    
      <xsl:template match="p">
        <xsl:copy>
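          <!-- Group runs of adjacent Link elements (key true) apart from all other nodes (key false) -->
          <xsl:for-each-group select="node()" group-adjacent="boolean(self::Link)">
            <xsl:choose>
              <xsl:when test="current-grouping-key()">
                <!-- Concatenate the text of every Link in the run -->
                <xsl:variable name="link-text" as="xsd:string" select="string-join(current-group(), '')"/>
                <xsl:choose>
                  <!-- Anything starting with http:// or https:// is a URL; the rest is treated as an email -->
                  <xsl:when test="matches($link-text, '^https?://')">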
                    <ext-link ext-link-type="uri" xlink:href="{$link-text}">
                      <xsl:value-of select="$link-text"/>
                    </ext-link>
                  </xsl:when>
                  <xsl:otherwise>
                    <email><xsl:value-of select="$link-text"/></email>
                  </xsl:otherwise>
                </xsl:choose>
              </xsl:when>
              <xsl:otherwise>
                <xsl:apply-templates select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:template>
    
    </xsl:stylesheet>
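
    The question asked specifically about contains(), so here is a minimal sketch of the same grouping with contains() tests on the joined string instead of matches(). It assumes that any Link text containing "://" is a URL and that any other Link text containing "@" is an email address; those two rules, and the fallback that copies unrecognised Link elements through unchanged, are assumptions for illustration and not something the source data guarantees.

    ```xslt
    <xsl:stylesheet
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      version="2.0">

      <xsl:template match="p">
        <xsl:copy>
          <!-- Same grouping as above: runs of adjacent Link elements vs. everything else -->
          <xsl:for-each-group select="node()" group-adjacent="boolean(self::Link)">
            <xsl:choose>
              <xsl:when test="current-grouping-key()">
                <xsl:variable name="link-text" select="string-join(current-group(), '')"/>
                <xsl:choose>
                  <!-- Assumption: "://" marks a URL; tested first so URLs containing "@" stay URLs -->
                  <xsl:when test="contains($link-text, '://')">
                    <ext-link ext-link-type="uri" xlink:href="{$link-text}">
                      <xsl:value-of select="$link-text"/>
                    </ext-link>
                  </xsl:when>
                  <!-- Assumption: any remaining text with "@" is an email address -->
                  <xsl:when test="contains($link-text, '@')">
                    <email><xsl:value-of select="$link-text"/></email>
                  </xsl:when>
                  <!-- Neither pattern matched: keep the original Link elements for manual review -->
                  <xsl:otherwise>
                    <xsl:copy-of select="current-group()"/>
                  </xsl:otherwise>
                </xsl:choose>
              </xsl:when>
              <xsl:otherwise>
                <xsl:apply-templates select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Because group-adjacent merges every run of neighbouring Link elements, two unrelated links that happen to sit next to each other with no text between them would be joined into one string; the explicit otherwise branch at least makes odd results easier to spot.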