python regex regex-lookarounds regex-group regex-negation

Quantifiers inside a positive look-behind using Python


I am trying to capture a group only if a positive lookbehind condition is satisfied.

The input string is one of the following:

  1. Cats 5A, 5B and 5C
  2. Cat 5A

Regex:

  1. (?P<cat_num>(?:(?<=((\b[c|C]at)[s]? )))5A) ==> Fails to compile: the ? quantifier makes the lookbehind variable-width, and Python's re module requires a fixed-width lookbehind.
  2. (?P<cat_num>(?:(?<=((\b[c|C]at)(?=[s]?) )))5A) ==> Compiles, but does not match "5A" when input 1 is given.
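
The failure of the first pattern can be reproduced with a minimal sketch like the one below. The helper name compile_error is hypothetical and only used for illustration; the pattern is copied from item 1 above.

```python
import re

# First pattern from the list above: the optional [s]? gives the
# lookbehind a variable width (4 or 5 characters).
VARIABLE_WIDTH = r"(?P<cat_num>(?:(?<=((\b[c|C]at)[s]? )))5A)"


def compile_error(pattern):
    """Return the re.error raised while compiling, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return exc
    return None


# Prints the compile error reported by the re module.
print(compile_error(VARIABLE_WIDTH))
```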

Requirement:

Using Python's re module, I want to capture "5A" in the named group cat_num for either of the two inputs above.


Solution

  • You don't need a lookbehind assertion. Instead, match what comes before the number and capture the value in the named capturing group cat_num.

    Make the s optional with [cC]ats? (this also replaces [c|C], which would match a literal | as well):

    \b[cC]ats? (?P<cat_num>5A)\b
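
As a rough usage sketch, the pattern above can be applied with re.search and the value read from the named group. The names CAT_NUM and find_cat_num are illustrative assumptions, not part of the original answer.

```python
import re

# Pattern from the answer: optional "s", a space, then "5A" in a named group.
CAT_NUM = re.compile(r"\b[cC]ats? (?P<cat_num>5A)\b")


def find_cat_num(text):
    """Return the cat_num group, or None when the pattern does not match."""
    match = CAT_NUM.search(text)
    return match.group("cat_num") if match else None


for sample in ("Cats 5A, 5B and 5C", "Cat 5A"):
    print(sample, "->", find_cat_num(sample))
```

Both sample inputs from the question yield "5A" in cat_num, since the prefix is consumed by the match rather than asserted by a lookbehind.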

    Or, for a somewhat broader match (one or more digits followed by one or more uppercase letters):

    \b[cC]ats? (?P<cat_num>\d+[A-Z]+)\b
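
A short sketch of how the broader pattern behaves on a few inputs. The extra sample strings and the names BROAD_CAT_NUM and find_broad_cat_num are assumptions made for illustration only.

```python
import re

# Broader pattern from the answer: digits followed by uppercase letters.
BROAD_CAT_NUM = re.compile(r"\b[cC]ats? (?P<cat_num>\d+[A-Z]+)\b")


def find_broad_cat_num(text):
    """Return the first cat_num found in text, or None if there is none."""
    match = BROAD_CAT_NUM.search(text)
    return match.group("cat_num") if match else None


# "Cat 6" has no trailing letter and "Category 5A" lacks the "Cat "/"Cats "
# prefix, so neither is expected to match.
for sample in ("Cats 5A, 5B and 5C", "cat 12AB", "Cat 6", "Category 5A"):
    print(sample, "->", find_broad_cat_num(sample))
```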