I need to sort a set of documents by attribute 'd' in ascending order, but the attribute values mix numbers with letters.
Attributes could be like: d="11A-1-000003" d="11-1-000008a" d="11-16-000009" d="11-1C-000008" d="11-9-000002" d="12-1-000008a" d="11-15-00014" d="13-1-000007a" d="11-15B-00014a" d="11-24-00043a" d="11-3-000023" d="11-3-000023a" d="11-3-000023b"
I tried several different solutions, but had no luck: the order is not correct.
<xsl:sort select="normalize-space(@d)" data-type="text" order="ascending" case-order="upper-first"/>
<xsl:sort select="replace(normalize-space(@d), '[^\d]', '')" data-type="number" order="ascending"/>
<xsl:sort select="substring-before(normalize-space(@d), '-')" data-type="number"/>
<xsl:sort select="substring-before(substring-after(normalize-space(@d), '-'), '-')" data-type="number"/>
<xsl:sort select="substring-after(substring-after(normalize-space(@d), '-'), '-')" data-type="number"/>
<xsl:sort select="substring-before(normalize-space(@d), '-')" data-type="text"/>
<xsl:sort select="substring-before(substring-after(normalize-space(@d), '-'), '-')" data-type="text"/>
<xsl:sort select="substring-after(substring-after(normalize-space(@d), '-'), '-')" data-type="text"/>
<xsl:sort select="number(tokenize(@d, '-')[1])" data-type="number"/>
<xsl:sort select="number(tokenize(@d, '-')[2])" data-type="number"/>
<xsl:sort select="number(tokenize(@d, '-')[3])" data-type="number"/>
<xsl:sort select="tokenize(normalize-space(@d), '-')[1]"/>
<xsl:sort select="tokenize(normalize-space(@d), '-')[2]"/>
<xsl:sort select="tokenize(normalize-space(@d), '-')[3]"/>
The actual result is that 11-3-000023 comes after 11-24-00043a but should come after 11-2, and 11-1C-000008 comes after 11-15B-000008 but should come after 11-1.
The expected result is that numbers take precedence over letters. Numbers are chapters, letters are subchapters.
For example, the expected result is:
d="11-1-000008a" d="11-1C-000008" d="11-3-000023" d="11-3-000023a" d="11-3-000023b" d="11-9-000002" d="11-15-00014" d="11-15B-00014a" d="11-16-000009" d="11-24-00043a" d="11A-1-000003" d="12-1-000008a" d="13-1-000007a"
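Here's a solution which uses a regular expression to parse the @d
values into six separate tokens (three numeric, each followed by an optional non-numeric one), and sorts the documents with a separate xsl:sort
for each of those tokens. The numeric tokens are compared as numbers, so 3 sorts before 24, and an empty non-numeric token sorts before any letter.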
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="documents">
<xsl:copy>
<!-- regular expression parses @d values into 6 tokens:
a numeric token, an optional non-numeric token, an ignored hyphen,
a numeric token, an optional non-numeric token, an ignored hyphen,
a numeric token, an optional non-numeric token.
-->
<xsl:variable name="parser">(\d+)([^\d]*)-(\d+)([^\d]*)-(\d+)([^\d]*)</xsl:variable>
<xsl:perform-sort select="*">
<xsl:sort select="replace(@d, $parser, '$1')" data-type="number"/>
<xsl:sort select="replace(@d, $parser, '$2')"/>
<xsl:sort select="replace(@d, $parser, '$3')" data-type="number"/>
<xsl:sort select="replace(@d, $parser, '$4')"/>
<xsl:sort select="replace(@d, $parser, '$5')" data-type="number"/>
<xsl:sort select="replace(@d, $parser, '$6')"/>
</xsl:perform-sort>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
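One caveat with the single-regex approach: replace() returns its input unchanged when the pattern does not match, so a value with an unexpected shape yields NaN for the numeric keys, and stray whitespace around a value stays in the extracted token. A more defensive variant, offered only as a sketch, splits on the hyphens first and then separates digits from letters per segment. The function names f:num and f:suffix and the namespace urn:example:sort-keys are hypothetical, made up for this illustration, and the sketch assumes each segment is digits followed by optional letters. For the sample input above it should give the same order.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:sort-keys"
    exclude-result-prefixes="xs f">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical helper: the digits of hyphen-separated segment $i, as a number.
       Assumes digits come before letters within a segment; NaN if the segment is missing. -->
  <xsl:function name="f:num" as="xs:double">
    <xsl:param name="d" as="xs:string?"/>
    <xsl:param name="i" as="xs:integer"/>
    <xsl:sequence select="number(replace(tokenize(normalize-space($d), '-')[$i], '\D', ''))"/>
  </xsl:function>

  <!-- Hypothetical helper: the non-digit part of segment $i ('' when there is none). -->
  <xsl:function name="f:suffix" as="xs:string">
    <xsl:param name="d" as="xs:string?"/>
    <xsl:param name="i" as="xs:integer"/>
    <xsl:sequence select="replace(tokenize(normalize-space($d), '-')[$i], '\d', '')"/>
  </xsl:function>

  <xsl:template match="documents">
    <xsl:copy>
      <xsl:perform-sort select="*">
        <!-- One numeric key and one text key per segment, most significant first. -->
        <xsl:sort select="f:num(@d, 1)" data-type="number"/>
        <xsl:sort select="f:suffix(@d, 1)"/>
        <xsl:sort select="f:num(@d, 2)" data-type="number"/>
        <xsl:sort select="f:suffix(@d, 2)"/>
        <xsl:sort select="f:num(@d, 3)" data-type="number"/>
        <xsl:sort select="f:suffix(@d, 3)"/>
      </xsl:perform-sort>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is six function calls per document instead of one shared pattern, in exchange for tolerating surrounding whitespace in @d.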
Input:
<documents>
<document d="11A-1-000003"/>
<document d="11-1-000008a"/>
<document d="11-16-000009"/>
<document d="11-1C-000008"/>
<document d="11-9-000002"/>
<document d="12-1-000008a"/>
<document d="11-15-00014"/>
<document d="13-1-000007a"/>
<document d="11-15B-00014a"/>
<document d="11-24-00043a"/>
<document d="11-3-000023"/>
<document d="11-3-000023a"/>
<document d="11-3-000023b"/>
</documents>
Output:
<documents>
<document d="11-1-000008a"/>
<document d="11-1C-000008"/>
<document d="11-3-000023"/>
<document d="11-3-000023a"/>
<document d="11-3-000023b"/>
<document d="11-9-000002"/>
<document d="11-15-00014"/>
<document d="11-15B-00014a"/>
<document d="11-16-000009"/>
<document d="11-24-00043a"/>
<document d="11A-1-000003"/>
<document d="12-1-000008a"/>
<document d="13-1-000007a"/>
</documents>