
"Unrolling" a string with XSL


We have an application that has a hierarchical group structure. Some of the groups are passed in this format:

/Geography/NA/US/California

I would like to "unroll" this string so that I can get a node set like the following:

/Geography
/Geography/NA
/Geography/NA/US
/Geography/NA/US/California

I know I can use str:tokenize to get a node set like [Geography, NA, US, California], but I'm at a loss as to how to assemble the parts back together incrementally.

I have most of the EXSLT functions available, but no XSLT 2.0 functions.

Any help appreciated!


Solution

  • This is quite easy in plain XSLT 1.0; all you need is a recursive named template like this:

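    <!-- Emits one <token> per path prefix of $string, shortest first. -->
    <xsl:template name="UnrollString">
      <xsl:param name="string" select="''" />
      <!-- $head is the prefix handled so far, including its trailing slash. -->
      <xsl:param name="head"   select="'/'" />
    
      <!-- Remainder after $head, with a slash appended so that the last
           segment is also followed by a delimiter. -->
      <xsl:variable name="tail" select="
        concat(
          substring-after($string, $head), 
          '/'
        )
      " />
      <!-- $head plus the next segment, i.e. the next prefix to emit. -->
      <xsl:variable name="lead" select="
        concat(
          $head, 
          substring-before($tail, '/')
        )
      " />
    
      <!-- Stop once nothing is left after $head. -->
      <xsl:if test="not($tail = '/')">
        <token>
          <xsl:value-of select="$lead" />
        </token>
    
        <xsl:call-template name="UnrollString">
          <xsl:with-param name="string" select="$string" />
          <xsl:with-param name="head"   select="concat($lead, '/')" />
        </xsl:call-template>
      </xsl:if>
    </xsl:template>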
    

    Output for '/Geography/NA/US/California' is:

    <token>/Geography</token>
    <token>/Geography/NA</token>
    <token>/Geography/NA/US</token>
    <token>/Geography/NA/US/California</token>
    

    Note that:

    • The template expects the string to start with a delimiter (i.e. a slash); otherwise the first word ('Geography') will be missing from the output.
    • A single trailing slash is ignored.
    • The delimiter could easily be generalized and passed in as a parameter.
    • You could build a hierarchy easily by placing the recursive call into the <token> element instead of outside.
    • The output order can be reversed (longest to shortest) by placing the recursive call above the <token> element instead of below it.
    • The template returns a result tree fragment, so you would need to use a node-set() extension function (for example EXSLT's exsl:node-set()) to convert the returned tokens into a node set that can be processed further.
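
The notes about the delimiter parameter and node-set() can be sketched together as a small stylesheet. This is only an illustration under a few assumptions: the template name UnrollStringBy, the delimiter parameter, the tokens-rtf variable and the groups/group output elements are made up for this example, and it assumes your processor supports EXSLT's exsl:node-set() (the question says most EXSLT functions are available).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:output method="xml" indent="yes" />

  <!-- Hypothetical variant of UnrollString: same recursion, but the
       delimiter is a parameter instead of a hard-coded slash. -->
  <xsl:template name="UnrollStringBy">
    <xsl:param name="string"    select="''" />
    <xsl:param name="delimiter" select="'/'" />
    <!-- The first call starts right after the leading delimiter. -->
    <xsl:param name="head"      select="$delimiter" />

    <xsl:variable name="tail"
      select="concat(substring-after($string, $head), $delimiter)" />
    <xsl:variable name="lead"
      select="concat($head, substring-before($tail, $delimiter))" />

    <xsl:if test="not($tail = $delimiter)">
      <token><xsl:value-of select="$lead" /></token>
      <xsl:call-template name="UnrollStringBy">
        <xsl:with-param name="string"    select="$string" />
        <xsl:with-param name="delimiter" select="$delimiter" />
        <xsl:with-param name="head"      select="concat($lead, $delimiter)" />
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <xsl:template match="/">
    <!-- Capturing the template output in a variable gives a result tree
         fragment, which XSLT 1.0 cannot navigate directly. -->
    <xsl:variable name="tokens-rtf">
      <xsl:call-template name="UnrollStringBy">
        <xsl:with-param name="string" select="'/Geography/NA/US/California'" />
      </xsl:call-template>
    </xsl:variable>

    <!-- exsl:node-set() turns the fragment into a real node set. -->
    <xsl:variable name="tokens" select="exsl:node-set($tokens-rtf)/token" />

    <groups count="{count($tokens)}">
      <xsl:for-each select="$tokens">
        <group depth="{position()}" path="{.}" />
      </xsl:for-each>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

Run against any input document, this should produce a groups element with count="4" containing one group element per prefix, from /Geography (depth 1) to /Geography/NA/US/California (depth 4). Keeping '/' as the default value of the delimiter parameter means existing calls that pass only the string behave like the original template.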