Tags: python, regex, regex-lookarounds

How to "match if there is one occurrence but don't match if the pattern appears multiple times" in regular expression


I'm trying to match a string that takes the form XXXX-NNNO

X: Capital Character

N: Integer

O: Optional capital character

For example

ABCD-111The => ABCD-111

ABCD-111 => ABCD-111

ABCD-111A => ABCD-111A

I wrote the regex [A-Z]{4}-\d{3}[A-Z]? but it also includes the 'T' from the first example, which I don't want. I also tried

[A-Z]{4}-\d{3}(?=[A-Za-z\s]) which matches the first one correctly but doesn't match the others. How do I write a regular expression that includes the trailing character only when the pattern occurs exactly once at the end, and leaves it out when it is followed by more of the same?


Solution

  • The following regex should work:

    [A-Z]{4}-\d{3}(?:[A-Z]\b)?
    

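    The trailing (?:[A-Z]\b)? matches one final uppercase character only if it is immediately followed by a word boundary (the end of the word). In "ABCD-111The" the 'T' is followed by 'h', so there is no boundary, the optional group is skipped, and the match stops at "ABCD-111".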
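
As a quick check, the regex above can be run against the three examples from the question with Python's standard `re` module. This is only a minimal sketch: the helper name `extract_code` and the constants are made up for illustration, and the question's first attempt is included purely for contrast.

```python
import re

# Regex from the answer: the optional trailing letter is kept only
# when a word boundary (\b) immediately follows it.
PATTERN = re.compile(r"[A-Z]{4}-\d{3}(?:[A-Z]\b)?")

# First attempt from the question, kept here only for comparison.
FIRST_ATTEMPT = re.compile(r"[A-Z]{4}-\d{3}[A-Z]?")


def extract_code(text, pattern=PATTERN):
    """Return the first match of `pattern` in `text`, or None (hypothetical helper)."""
    match = pattern.search(text)
    return match.group(0) if match else None


for sample in ["ABCD-111The", "ABCD-111", "ABCD-111A"]:
    print(sample, "=>", extract_code(sample))
# ABCD-111The => ABCD-111
# ABCD-111 => ABCD-111
# ABCD-111A => ABCD-111A

# The first attempt greedily takes the 'T' as well:
print(extract_code("ABCD-111The", FIRST_ATTEMPT))  # ABCD-111T
```

Note that `\b` treats digits and underscores as word characters too, so in an input such as "ABCD-111A1" the trailing 'A' would not be included either.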