python pandas regex conditional-statements series

If-then conditional logic in regex in Python


I am attempting to implement a conditional statement within regex, applied via the pandas.Series.str.extractall method. Given the reading I've done here, this seems like a pretty easy problem to solve, but I am still getting stuck...

I have the following regex in the Pythex tester:

(a)(?(1)b|c)

As I understand it, (a) is my first test group. The conditional block (?(1)b|c) should attempt to match "b" if my first test group is a match, or else it will attempt to match "c". The results I am hoping for are as follows:

  1. "b" = No Match
  2. "ab" = Match
  3. "c" = Match
  4. "ac" = No Match

The (a)(?(1)b|c) statement achieves 1, 2, and 4, but it misses 3... Any tips?
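For reference, the behaviour described above can be reproduced outside the tester with a minimal sketch using Python's built-in re module (the variable names here are only illustrative, and re.search is assumed as the matching mode):

```python
import re

# The pattern from the question: group 1 is mandatory, followed by a conditional on group 1.
pattern = re.compile(r"(a)(?(1)b|c)")

# Check each of the four test strings for a match anywhere in the string.
results = {s: bool(pattern.search(s)) for s in ["b", "ab", "c", "ac"]}
print(results)
# {'b': False, 'ab': True, 'c': False, 'ac': False}
```

Only "c" differs from the hoped-for results.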

Thank you!


Solution

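  • To get the matches, you don't need a conditional. In (a)(?(1)b|c), the group (a) is mandatory, so by the time the conditional is reached group 1 has always matched and the "else" branch c is never tried. That is why "c" on its own fails.

    "If a, then also match b, else match c" can be written as: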

    \b(?:ab|c)\b
    

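As a rough check, the alternation can be run against the four test strings from the question. The sketch below also includes a variant that keeps the conditional by making group 1 optional; that variant is an assumption added for comparison and is not part of the answer above. It is tested with re.fullmatch so that "ac" cannot match on its trailing "c".

```python
import re

strings = ["b", "ab", "c", "ac"]

# Pattern from the answer: plain alternation between word boundaries.
alternation = re.compile(r"\b(?:ab|c)\b")
alt_results = {s: bool(alternation.search(s)) for s in strings}

# Assumed variant (not from the answer): the conditional works once group 1
# is optional, because the "else" branch can then be reached.
conditional = re.compile(r"(a)?(?(1)b|c)")
cond_results = {s: bool(conditional.fullmatch(s)) for s in strings}

print(alt_results)
print(cond_results)
# Both print: {'b': False, 'ab': True, 'c': True, 'ac': False}
```

Note that pandas.Series.str.extractall requires at least one capture group in the pattern, so for that method the alternation would need a capturing group, for example \b(ab|c)\b.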