
Regex that removes the 2 trailing letters from a string when they are not preceded by another letter


This is in C#. I've been racking my brain over this, but no luck so far.

So for example

123456BVC --> 123456BVC (keep the same)
123456BV --> 123456 (remove trailing letters) 
12345V --> 12345V (keep the same)
12345 --> 12345 (keep the same)
ABC123AB --> ABC123 (remove trailing letters) 

It can start with anything.

I've tried @".*[a-zA-Z]{2}$", but it doesn't work.

I want to always return a string, with the two trailing letters removed if they exist and are not preceded by another letter:

Match result = Regex.Match(mystring, pattern);
return result.Value;

Solution

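  • Your @".*[a-zA-Z]{2}$" regex matches any 0 or more characters other than a newline (as many as possible), followed by 2 ASCII letters at the end of the string. Nothing checks the context, so the 2 letters are matched regardless of what comes before them. In addition, Regex.Match(...).Value returns the matched text (or an empty string when nothing matches), not the input with the letters removed.

    You need a regex that matches the last two letters only when they are not preceded by a letter: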

    (?<!\p{L})\p{L}{2}$
    

    Details:

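    • (?<!\p{L}) - a negative lookbehind that fails the match if a letter (\p{L}) is found immediately before the current position (you may use [a-zA-Z] if you only want to deal with ASCII letters)
    • \p{L}{2} - exactly 2 letters
    • $ - end of string (in .NET, $ also matches just before a final trailing newline; use \z if the match must be at the very end).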
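
    The choice between \p{L} and [a-zA-Z] changes the result as soon as non-ASCII letters appear. The sketch below is only an illustration: the class name LetterClassDemo, the helper Strip, and the sample input "123\u00E9AB" (123éAB) are made up for this example and are not part of the original answer.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class LetterClassDemo
    {
        // Unicode-aware pattern from the answer: \p{L} matches any Unicode letter.
        public const string UnicodePattern = @"(?<!\p{L})\p{L}{2}$";

        // ASCII-only variant: only A-Z and a-z count as letters.
        public const string AsciiPattern = @"(?<![a-zA-Z])[a-zA-Z]{2}$";

        // Hypothetical helper: removes whatever the given pattern matches.
        public static string Strip(string input, string pattern) =>
            Regex.Replace(input, pattern, string.Empty);

        public static void Main()
        {
            // Hypothetical input: digits, then U+00E9 (e with acute), then "AB".
            string input = "123\u00E9AB";

            // \p{L} treats U+00E9 as a letter, so "AB" is preceded by a letter
            // and the input is returned unchanged.
            Console.WriteLine(Strip(input, UnicodePattern));

            // [a-zA-Z] does not treat U+00E9 as a letter, so "AB" is removed.
            Console.WriteLine(Strip(input, AsciiPattern));
        }
    }
    ```

    Which variant is right depends on whether accented or non-Latin letters should count as "letters" for your data.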

    In C#, use Regex.Replace instead of Regex.Match so that the matched letters are removed from the input:

    var result = Regex.Replace(mystring, @"(?<!\p{L})\p{L}{2}$", string.Empty);
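
    To check the pattern against the inputs from the question, a small console program along these lines could be used. It is a minimal sketch: the class name TrailingLetters and the methods Strip and OriginalAttempt are hypothetical names chosen for this example, and OriginalAttempt simply reproduces the Regex.Match call from the question for comparison.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class TrailingLetters
    {
        // Pattern from the answer: two letters at the end, not preceded by a letter.
        private static readonly Regex Fixed = new Regex(@"(?<!\p{L})\p{L}{2}$");

        // Pattern from the question, kept only to show how it behaves.
        private static readonly Regex Original = new Regex(@".*[a-zA-Z]{2}$");

        // Returns the input with the two trailing letters removed when the
        // pattern matches, and the input unchanged otherwise.
        public static string Strip(string input) =>
            Fixed.Replace(input, string.Empty);

        // Reproduces the question's approach: returns the matched text,
        // or an empty string when there is no match.
        public static string OriginalAttempt(string input) =>
            Original.Match(input).Value;

        public static void Main()
        {
            string[] samples = { "123456BVC", "123456BV", "12345V", "12345", "ABC123AB" };
            foreach (string s in samples)
            {
                Console.WriteLine($"{s} --> {Strip(s)} (original attempt: \"{OriginalAttempt(s)}\")");
            }
        }
    }
    ```

    With the five inputs from the question, Strip changes only 123456BV and ABC123AB, while the original attempt returns either the whole input or an empty string and never a shortened one.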