xml xpath xquery xslt-2.0

How to grab the index numbers and wrap them in a 'locator' element


Here we are trying to extract the index numbers and wrap each one in a new 'locator' element. See the example below:

INPUT XML:

<?xml version="1.0" encoding="UTF-8"?>
<index>
<h1>Tax consequences of abandoning trade secret, 15.44&#x2013;15.45</h1>
<h2>Licensing agreement, address provision in, 8.34&#x0192;</h2>
<h3>Secretary of State&#x2019;s office, 9.13</h3>
<h2>Punitive and compensatory damages under federal constitutional maximum, 1.5-to-one ratio between, 12.22</h2>
<h4>Secretary of State&#x2019;s office, 19.13</h4>
<h5>Resolving ambiguities in insurance policy against insurer, 14.3</h5>
<h6>Bad faith lawsuit against competitor, antitrust consequences of, 11.2, 11.81</h6>
<h2>Consent to assignment, 8.43A&#x0192;, 9.10</h2>
<h2>Crime or fraud in misappropriation of trade secret, waiver of attorney-client privilege in cases of, 11.101</h2>
<h3>Representing clients in same field of technology, 17.10A</h3>
<h2>CCP &#x00a7;425.16, anti-SLAPP motions under, 11.29</h2>
<h3>CCP &#x00a7;2019.210, 11.44, 11.51, 11.53</h3>                    
</index>

EXPECTED OUTPUT:

<?xml version="1.0" encoding="UTF-8"?>
<index>
<h1>Tax consequences of abandoning trade secret, <locator>15.44</locator>&#x2013;<locator>15.45</locator></h1>
<h2>Licensing agreement, address provision in, <locator>8.34&#x0192;</locator></h2>
<h3>Secretary of State&#x2019;s office, <locator>9.13</locator></h3>
<h2>Punitive and compensatory damages under federal constitutional maximum, 1.5-to-one ratio between, <locator>12.22</locator></h2>
<h4>Secretary of State&#x2019;s office, <locator>19.13</locator></h4>
<h5>Resolving ambiguities in insurance policy against insurer, <locator>14.3</locator></h5>
<h6>Bad faith lawsuit against competitor, antitrust consequences of, <locator>11.2</locator>, <locator>11.81</locator></h6>
<h2>Consent to assignment, <locator>8.43A&#x0192;</locator>, <locator>9.10</locator></h2>
<h2>Crime or fraud in misappropriation of trade secret, waiver of attorney-client privilege in cases of, <locator>11.101</locator></h2>
<h3>Representing clients in same field of technology, <locator>17.10A</locator></h3>
<h2>CCP &#x00a7;425.16, anti-SLAPP motions under, <locator>11.29</locator></h2>
<h3>CCP &#x00a7;2019.210, <locator>11.44</locator>, <locator>11.51</locator>, <locator>11.53</locator></h3>                    
</index>

XSLT CODE:

<xsl:template match="//text()">
    <xsl:analyze-string select="." regex="((([0-9]+)([A-Z])?)\.([0-9A-Zƒ]+))">
        <xsl:matching-substring>
            <xsl:choose>
                <xsl:when test="regex-group(1)">
                    <xsl:value-of select="replace(., '(([0-9A-Z]+)\.([0-9A-Zƒ]+))', '&lt;locator&gt;$2.$3&lt;/locator&gt;')" disable-output-escaping="yes"/>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="."/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <xsl:value-of select="."/>
        </xsl:non-matching-substring>
    </xsl:analyze-string>
</xsl:template>

Reference URL: https://xsltfiddle.liberty-development.net/93nwgDD/1


Solution

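  • Try this. The outer regex captures the trailing run of index numbers after the last ", " that starts with a digit; the inner one splits that run on ", " or an en dash and wraps every remaining piece in 'locator':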

    <xsl:template match="//text()">
        <xsl:analyze-string select="." regex="(, )([0-9][^a-z]+$)">
            <xsl:matching-substring>
                <xsl:value-of select="regex-group(1)"/>
                <xsl:analyze-string select="regex-group(2)" regex="(, |–)">
                    <xsl:matching-substring>
                        <xsl:value-of select="."/>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring>
                        <locator><xsl:value-of select="."/></locator>
                    </xsl:non-matching-substring>
                </xsl:analyze-string>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
    

    See the transformation at https://xsltfiddle.liberty-development.net/3MEcZxA
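
    The template above is only a fragment, so here is a minimal sketch of a complete stylesheet it could sit in. The identity template, the xsl:output settings and the explicit priority are my assumptions, not part of the original answer. The priority is there because a plain match="text()" has the same default priority as the identity template's node(); the "//text()" pattern used above avoids that conflict implicitly. The en dash is written as &#x2013; so the file does not depend on its encoding.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Assumed output settings; adjust to your pipeline -->
    <xsl:output method="xml" encoding="UTF-8" indent="no"/>

    <!-- Identity template (assumption): copies elements and attributes unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Explicit priority so this rule wins over node() in the identity template -->
    <xsl:template match="text()" priority="1">
        <!-- Group 1: the ", " separator; group 2: a run that starts with a digit
             and contains no lowercase letters up to the end of the text node -->
        <xsl:analyze-string select="." regex="(, )([0-9][^a-z]+$)">
            <xsl:matching-substring>
                <xsl:value-of select="regex-group(1)"/>
                <!-- Split the run on ", " or an en dash -->
                <xsl:analyze-string select="regex-group(2)" regex="(, |&#x2013;)">
                    <xsl:matching-substring>
                        <!-- Separators are copied through as they are -->
                        <xsl:value-of select="."/>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring>
                        <!-- Everything between separators is one locator -->
                        <locator><xsl:value-of select="."/></locator>
                    </xsl:non-matching-substring>
                </xsl:analyze-string>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <!-- Heading text before the locators is left untouched -->
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>
```

    This needs an XSLT 2.0 processor (for example Saxon), since xsl:analyze-string is not available in XSLT 1.0. One limitation to keep in mind: because of [^a-z]+$, the locator run has to reach the end of the text node without any lowercase letter, so an entry with trailing words after the numbers would be left unwrapped.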