Tags: xml, substring, xslt-2.0, xslt-3.0

Substring up to the last occurrence of any non-word character in XSLT 2.0 / 3.0


This is similar to the question "Finding the last occurrence of a string xslt 1.0", but for XSLT 2.0 or 3.0, and I want to break on the last non-word character (like the regex \W).

The input is REMOVE-THIS-IS-A-TEST-LINE,XXX,XXXXX

The desired output is:

REMOVE-THIS-IS-A-TEST-LINE,XXX,
      

The following works for one delimiter at a time, but I need to break on at least commas, spaces, and dashes:

substring('REMOVE-THIS-IS-A-TEST-LINE,XXX,XXXXX',1,index-of(string-to-codepoints('REMOVE-THIS-IS-A-TEST-LINE,XXX,XXXXX'),string-to-codepoints(' '))[last()])

I am using Oxygen with Saxon 9.9 EE and Antenna House.


Solution

  • I would do:

    replace('REMOVE-THIS-IS-A-TEST-LINE,XXX,XXXXX', '(\W)\w*$', '$1')
    

    However, this involves backtracking, so it might be expensive on a long line. To avoid the backtracking, try the XPath 3.0 analyze-string() function, dropping the last item it returns:

    string-join(
       analyze-string('REMOVE-THIS-IS-A-TEST-LINE,XXX,XXXXX', '\W')
          /*[not(position()=last())])
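
    To show how this might be used inside a stylesheet, here is a minimal sketch that works in both XSLT 2.0 and 3.0. The my: namespace, the function names my:up-to-last-break and my:up-to-last-delim, and the template name main are hypothetical and chosen only for illustration. The first function wraps the replace() call above. The second avoids regular expressions by extending the codepoint approach from the question to a fixed set of delimiters, so it does not cover everything \W would match.

    ```xslt
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:my="urn:example:my"
        exclude-result-prefixes="xs my">

      <xsl:output method="text"/>

      <!-- Hypothetical helper: keep everything up to and including the last \W character. -->
      <xsl:function name="my:up-to-last-break" as="xs:string">
        <xsl:param name="s" as="xs:string"/>
        <xsl:sequence select="replace($s, '(\W)\w*$', '$1')"/>
      </xsl:function>

      <!-- Hypothetical helper: same idea without a regex, for an explicit delimiter list. -->
      <xsl:function name="my:up-to-last-delim" as="xs:string">
        <xsl:param name="s" as="xs:string"/>
        <xsl:param name="delims" as="xs:string"/>
        <xsl:variable name="cps" select="string-to-codepoints($s)"/>
        <!-- Positions of every delimiter character found in $s. -->
        <xsl:variable name="hits" as="xs:integer*"
            select="for $d in string-to-codepoints($delims) return index-of($cps, $d)"/>
        <!-- With no delimiter present, return the input unchanged. -->
        <xsl:sequence select="if (empty($hits)) then $s else substring($s, 1, max($hits))"/>
      </xsl:function>

      <!-- Run with "main" as the initial template; no source document is needed. -->
      <xsl:template name="main">
        <xsl:value-of select="my:up-to-last-break('REMOVE-THIS-IS-A-TEST-LINE,XXX,XXXXX')"/>
        <xsl:text>&#10;</xsl:text>
        <xsl:value-of select="my:up-to-last-delim('REMOVE-THIS-IS-A-TEST-LINE,XXX,XXXXX', ',- ')"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Both calls should print REMOVE-THIS-IS-A-TEST-LINE,XXX, for the sample input. Edge cases differ between the approaches: replace() returns the input unchanged when it contains no non-word character or already ends with one, whereas the analyze-string() expression returns an empty string in the first case and drops the trailing delimiter in the second.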