
Matching any number of tokens in any order


I have files that can be named in any of these formats: abc, abcX, abcY, abcXY and abcYX,
where X and Y are strings that follow a specific pattern and 'abc' can be any string other than X or Y.

I am trying to extract just 'abc' for all of the possible formats.

This is what I am currently using.

Match match;
// Repeatedly drop the last X or Y (and anything after it) until none is left.
while ((match = Regex.Match(fileName, "(.*)(X|Y)")).Success)
    fileName = match.Groups[1].Value;

These are some of the patterns I have tried.

(.*)(X|Y), (.*)(X|Y|XY|YX) and (.*)(X|X?Y) all fail on abcXY and abcYX, because the greedy (.*) swallows the first token. For abcXY, for example, they return the groups (abcXY)(abcX)(Y).

Every positive lookahead I have tried fails on abcX and abcY. For example, ^(?=(.*)X)(?=(.*)Y).*$ matches only abcXY and abcYX, because it requires both tokens to be present. For abcXY it returns the groups (abcXY)(abc)(abcX).

What I have works, but I would like a solution that avoids the while loop.


Solution

  • Wiktor Stribiżew's answer in the comments solved the issue:

    ^(.*?)(Y?X|X?Y)?$ - see regex101.com/r/TQCzDC/1
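
For anyone wanting to drop the loop, here is a minimal C# sketch of how that pattern might be used. The question never says what X and Y actually are, so `_v\d+` and `_draft` below are purely hypothetical stand-ins, and the `BaseName` class and `Extract` method are illustrative names rather than anything from the original code. The lazy `(.*?)` captures as little as possible, the optional group then consumes at most one X and one Y in either order, and `$` forces the match to reach the end of the name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BaseName
{
    // Hypothetical stand-ins for the question's X and Y tokens.
    private const string X = @"_v\d+";
    private const string Y = @"_draft";

    // Same shape as ^(.*?)(Y?X|X?Y)?$, with X and Y wrapped in
    // non-capturing groups so that only 'abc' lands in group 1.
    private static readonly Regex Pattern =
        new Regex($"^(.*?)(?:(?:{Y})?(?:{X})|(?:{X})?(?:{Y}))?$");

    public static string Extract(string fileName)
    {
        Match match = Pattern.Match(fileName);
        // Fall back to the input unchanged if the pattern does not match.
        return match.Success ? match.Groups[1].Value : fileName;
    }
}

public static class Program
{
    public static void Main()
    {
        string[] names =
        {
            "report", "report_v2", "report_draft",
            "report_v2_draft", "report_draft_v10"
        };

        // One regex call per name, no loop over repeated matches.
        foreach (string name in names)
            Console.WriteLine($"{name} -> {BaseName.Extract(name)}");
    }
}
```

With these stand-in tokens, every name in the list reduces to `report`. If the real X or Y contains alternation of its own, the non-capturing wrappers keep it from leaking into the surrounding pattern.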