XSL split string on specific characters and apply processing (Example From DITA Open Toolkit Processing)


I have this rule in my XSL already:

  <xsl:template match="title[starts-with(.,'Key Information for')]">
    <xsl:copy>
      <xsl:attribute name="outputclass">Key Information Long</xsl:attribute>
      <xsl:apply-templates select="@*  | node()"/>
    </xsl:copy>
  </xsl:template>  

What it does: during pre-processing in the DITA Open Toolkit, it adds outputclass="Key Information Long" to title elements whose text starts with "Key Information for".

What I need to do is add an XPath to a match rule that finds other titles that fit this pattern:

<title>Some title text (More title text)</title>

My XPath needs to identify titles that end with text in parentheses and wrap that piece of the title in new tagging: <ph aid:cstyle="parenthetical-subtitle">(More title text)</ph>. When I add this rule, I also need to know how to integrate it with the existing rule, so that a title matching both still gets both kinds of processing without an ambiguity conflict (the existing rule and the new one could apply to the same content).

I imagined an edit using choose/when logic with the ends-with() function, but I hit a wall on exactly how to do this, and it turns out the toolkit version I am using only supports XML 1.0. To illustrate:

If I have this input xml:

<section>
<title>Key Information for Our Test Process</title>
<p>Some stupid content.</p>
</section>
<section>
<title>Some Other Title</title>
<p>Other content.</p>
</section>
<section>
<title>Key Information for Testing (Stupid People Only)</title>
<p>Parentheses identify an audience which could really be anything.</p>
</section>

I want my edited rules to produce this output:

<section>
<title outputclass="Key Information Long">Key Information for Our Test Process</title>
<p>Some stupid content.</p>
</section>
<section>
<title>Some Other Title</title>
<p>Other content.</p>
</section>
<section>
<title outputclass="Key Information Long">Key Information for Testing <ph aid:cstyle="parenthetical-subtitle">(Stupid People Only)</ph></title>
<p>Parentheses identify an audience which could really be anything.</p>
</section>

Solution

  • If we can assume that the title element contains only plain text (no child elements) and that you have an XSLT 2 or 3 processor, as your mention of ends-with suggests, you can match on the text node inside the element and process it with xsl:analyze-string:

      <xsl:template match="title[starts-with(.,'Key Information for')]">
        <xsl:copy>
          <xsl:attribute name="outputclass">Key Information Long</xsl:attribute>
          <xsl:apply-templates select="@*  | node()"/>
        </xsl:copy>
      </xsl:template>
    
      <xsl:template match="title[starts-with(.,'Key Information for')]/text()">
          <xsl:analyze-string select="." regex="\(.*\)$">
              <xsl:matching-substring>
                  <ph aid:cstyle="parenthetical-subtitle">
                      <xsl:value-of select="."/>
                  </ph>
              </xsl:matching-substring>
              <xsl:non-matching-substring>
                  <xsl:value-of select="."/>
              </xsl:non-matching-substring>
          </xsl:analyze-string>
      </xsl:template>
    

    https://xsltfiddle.liberty-development.net/6pS2B6n
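
The text-node template above only fires inside titles that start with "Key Information for", while the question also asks about other titles that end in a parenthetical. A minimal sketch of a complete stylesheet that covers both cases might look like the following. It makes a few assumptions that are not in the original post: the aid prefix is bound to what I believe is the Adobe InDesign namespace URI (substitute whatever your pipeline declares), an identity template is added for everything else, the text-node rule is widened to every title, and the regex is narrowed to [^()]* so that only the last parenthesized group is wrapped. It still requires an XSLT 2 or 3 processor and a title whose text ends exactly with ")", with no trailing whitespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. The aid namespace URI below is an assumption;
     use the URI your own documents or pipeline declare. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:aid="http://ns.adobe.com/AdobeInDesign/4.0/">

  <!-- Identity template: copy anything no other rule handles. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Existing rule from the question: matches the title ELEMENT
       and adds the outputclass attribute. -->
  <xsl:template match="title[starts-with(., 'Key Information for')]">
    <xsl:copy>
      <xsl:attribute name="outputclass">Key Information Long</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- New rule: matches TEXT NODES that are children of any title.
       It cannot conflict with the element rule above because the two
       patterns select different kinds of node. -->
  <xsl:template match="title/text()">
    <!-- Wrap a trailing "(...)" group; pass all other text through. -->
    <xsl:analyze-string select="." regex="\([^()]*\)$">
      <xsl:matching-substring>
        <ph aid:cstyle="parenthetical-subtitle">
          <xsl:value-of select="."/>
        </ph>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With this arrangement a title such as "Some Other Title (Draft)" would get the ph wrapper but no outputclass, while a "Key Information for ... (...)" title gets both, because the element rule calls xsl:apply-templates on its children and the text-node rule then handles the text. The serialized ph element will also carry an xmlns:aid declaration unless an ancestor in the output already declares it.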