I'm using Saxon-EE 11 and my platform's language is en-us.
I'm attempting to implement custom sorting behavior for an <xsl:sort> instruction by specifying a UCA collation. Ignoring the XML document details and just getting to the core, string-by-string comparison question, I want these strings:
ABSENTEES
ABSENTEE VOTING
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
to be sorted into this order:
ABSENTEE VOTING
ABSENTEES
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
Attempting to render the rules into English, roughly: a space sorts before any letter (so ABSENTEE VOTING comes before ABSENTEES), and a hyphen or slash sorts as if it were a space.
The UCA collation http://www.w3.org/2013/collation/UCA?alternate=shifted handles the MINNEAPOLIS* strings correctly, but it puts ABSENTEES before ABSENTEE VOTING.
The bare UCA collation http://www.w3.org/2013/collation/UCA handles ABSENTEES and ABSENTEE VOTING correctly, but places the MINNEAPOLIS/SAINT PAUL and MINNEAPOLIS-SAINT PAUL strings after anything with MINNEAPOLIS and a space character.
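For reference, the two attempts described above would look roughly like this on the sort instruction. This is only a sketch: the root and string element names are assumed for illustration, and version="3.0" is assumed because the UCA collation URI scheme was introduced alongside XSLT 3.0.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Assumed input shape: a root element holding one string element per heading -->
  <xsl:template match="/root">
    <output>
      <xsl:perform-sort select="string">
        <!-- Attempt 1: spaces and punctuation are treated as ignorable -->
        <xsl:sort select="."
            collation="http://www.w3.org/2013/collation/UCA?alternate=shifted"/>
        <!-- Attempt 2: swap in the bare collation instead:
             collation="http://www.w3.org/2013/collation/UCA" -->
      </xsl:perform-sort>
    </output>
  </xsl:template>
</xsl:stylesheet>
```

With alternate=shifted the space no longer takes part in the primary comparison, which would explain why ABSENTEES lands ahead of ABSENTEE VOTING in the first attempt.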
I've attempted a few other combinations of parameters, though none of them has produced anything closer to what I'm looking for. I'm close to giving up and implementing either a custom pre-processing before applying the collation or else dropping into a Java implementation.
If what I'm looking for is truly not achievable with UCA collations, that's good to know.
Using an input of:
XML
<root>
  <string>ABSENTEES</string>
  <string>ABSENTEE VOTING</string>
  <string>MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)</string>
  <string>MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT</string>
  <string>MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD</string>
  <string>MINNEAPOLIS</string>
  <string>MINNEAPOLIS PORT AUTHORITY</string>
</root>
and the following stylesheet:
XSLT 2.0
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/root">
    <output>
      <xsl:perform-sort select="string">
        <!-- The third argument holds TWO spaces, one per character in '-/';
             with a single space, '/' would be deleted instead of replaced -->
        <xsl:sort select="translate(., '-/', '  ')"/>
      </xsl:perform-sort>
    </output>
  </xsl:template>
</xsl:stylesheet>
I get:
Result
<?xml version="1.0" encoding="UTF-8"?>
<output>
  <string>ABSENTEE VOTING</string>
  <string>ABSENTEES</string>
  <string>MINNEAPOLIS</string>
  <string>MINNEAPOLIS PORT AUTHORITY</string>
  <string>MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD</string>
  <string>MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT</string>
  <string>MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)</string>
</output>
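One caveat: the stylesheet above names no collation, so it depends on whatever the default collation is in the configuration at hand. A hedged variant that pins the Unicode codepoint collation explicitly is sketched below. That this matches the default used for the run above is an assumption. Under codepoint ordering a space (U+0020) sorts before every letter, which is what puts ABSENTEE VOTING ahead of ABSENTEES.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/root">
    <output>
      <xsl:perform-sort select="string">
        <!-- Sort key only: '-' and '/' each become a space (two spaces in the
             third argument); the original strings are what get output -->
        <xsl:sort select="translate(., '-/', '  ')"
            collation="http://www.w3.org/2005/xpath-functions/collation/codepoint"/>
      </xsl:perform-sort>
    </output>
  </xsl:template>
</xsl:stylesheet>
```

Codepoint order is case-sensitive, so this sketch suits all-uppercase headings like the ones here; mixed-case data would likely need something like upper-case() applied inside the sort key as well.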