
Python regex: Using Alternation for sets of words with delimiter


I want to match strings that follow a specific pattern:

  • First character from [A,C,K,M,F]
  • Followed by a number (float or integer). Allowed instances: 1,2.5,3.6,9,0,6.3 etc.
  • Ending with one of these Roman numerals: [I, II, III, IV, V].

The regex that I am using is the following:

bool(re.match(r'(A|C|K|M|F){1}\d+\.?\d?(I|II|III|IV|V)$', test_str))

The "(I|II|III|IV|V)" part returns True for test_str='C5.3IV', but I also want a match when two of the Roman numerals are present at the same time, separated by the delimiter /. That is, the regex should also return True for test_str='C5.3IV/V'.

How should I modify the regex?

Thanks


Solution

  • Try this:

    bool(re.match(r'[ACKMF]\d+\.?\d?(I|II|III|IV|V)(/(I|II|III|IV|V))*$', test_str))
    

    The added group (/(I|II|III|IV|V))* allows zero or more further numerals, each preceded by a /.

    I also changed the start of your expression from (A|C|K|M|F){1} to [ACKMF]. Characters between square brackets form a character class, which matches exactly one character from the listed options. You most commonly see them with ranges like [A-Z0-9] to match a capital letter or a digit, but you can also list individual characters, as I've done for your regex.
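
As a quick check, the pattern above can be wrapped in a small helper and run against a few strings. This is only a sketch: the names NUMERAL, PATTERN and is_valid are made up for illustration, and the numeral alternation is factored into a non-capturing group so that it is written once. The matching behavior is intended to be the same as the one-line version.

```python
import re

# Hypothetical helper names; only the pattern itself comes from the answer.
# (?:...) is a non-capturing group: it groups the alternatives without
# creating a numbered capture.
NUMERAL = r'(?:I|II|III|IV|V)'

# One leading letter, the number, one numeral, then zero or more "/numeral".
PATTERN = re.compile(r'[ACKMF]\d+\.?\d?' + NUMERAL + r'(?:/' + NUMERAL + r')*$')

def is_valid(test_str):
    """Return True if test_str matches the pattern from its first character."""
    return bool(PATTERN.match(test_str))

print(is_valid('C5.3IV'))      # True
print(is_valid('C5.3IV/V'))    # True
print(is_valid('C5.3IV/'))     # False: trailing delimiter without a numeral
print(is_valid('C5.3VI'))      # False: VI is not in the allowed list
print(is_valid('C5.3IV/V/I'))  # True: * allows any number of extra numerals
```

If at most two numerals should be accepted, replacing the final * with ? makes the delimiter group optional instead of repeatable.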