Tags: python, regex, regex-group, regex-look-ahead

Regex - Ignore if group has prefix


I am trying to capture 8-digit phone numbers in free text. A number should be ignored if a particular string appears immediately before it.

My regex:

(\b(\+?001|002)?[-]?\d{4}(-|\s)?\d{4}\b)

To Capture:

+001 12345678
12345678

Not Capture:

TTT-12345678-123
TTT-12345678

I am trying to use a negative lookbehind, as in the example below:

\w*(?<!foo)bar

But the above works only if the regex doesn't have subsequent groups.


Solution

  • You may use

    (?<!TTT-)(?<!\w)(?:\+?001|002)?[-\s]?\d{4}[-\s]?\d{4}\b
    


    Details

    • (?<!TTT-) - no TTT- allowed immediately on the left
    • (?<!\w) - no word char allowed immediately on the left
    • (?:\+?001|002)? - an optional non-capturing group matching 1 or 0 occurrences of +001, 001 or 002
    • [-\s]? - an optional - or whitespace
    • \d{4} - any four digits
    • [-\s]?\d{4} - an optional - or whitespace and any four digits
    • \b - a word boundary.
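
As a quick check, the pattern above can be run against the sample strings from the question with Python's `re` module. This is a minimal sketch: the names `PHONE_RE` and `find_numbers` are made up for illustration and are not part of the original answer.

```python
import re

# Pattern copied from the answer above. Both lookbehinds are fixed-width,
# which Python's re module requires.
PHONE_RE = re.compile(
    r"(?<!TTT-)(?<!\w)(?:\+?001|002)?[-\s]?\d{4}[-\s]?\d{4}\b"
)


def find_numbers(text):
    """Return every substring of `text` matched by PHONE_RE (hypothetical helper)."""
    return [m.group(0) for m in PHONE_RE.finditer(text)]


# Sample strings taken from the question.
samples = ["+001 12345678", "12345678", "TTT-12345678-123", "TTT-12345678"]
for s in samples:
    print(repr(s), "->", find_numbers(s))
```

The first two samples should each yield one match, and the two `TTT-` samples should yield none. One thing to be aware of: because `[-\s]?` is optional even when no prefix is present, a match can start with a hyphen or whitespace character if that character follows a non-word character (for example, the match in `": 12345678"` includes the leading space), so stripping the result may be needed.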

    If the number can be glued to a word char on the right, replace the \b word boundary with the right-hand digit boundary, (?!\d).
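
The difference between the two right-hand boundaries can be sketched in Python as follows. The variable names and the test strings `12345678abc` and `123456789` are assumptions chosen for illustration; they do not come from the question.

```python
import re

# Shared left-hand part of the pattern from the answer.
BASE = r"(?<!TTT-)(?<!\w)(?:\+?001|002)?[-\s]?\d{4}[-\s]?\d{4}"

# Original form: the number must end at a word boundary.
WORD_BOUNDARY_RE = re.compile(BASE + r"\b")

# Variant: the number only has to be followed by a non-digit (or end of text).
DIGIT_BOUNDARY_RE = re.compile(BASE + r"(?!\d)")

glued = "12345678abc"      # number glued to letters on the right
too_long = "123456789"     # nine digits, so not an 8-digit number

# There are no capturing groups, so findall returns the full matches.
print(WORD_BOUNDARY_RE.findall(glued))
print(DIGIT_BOUNDARY_RE.findall(glued))
print(DIGIT_BOUNDARY_RE.findall(too_long))
```

With `\b`, the glued string produces no match because a digit followed by a letter is not a word boundary. With `(?!\d)`, the eight digits are matched, while the nine-digit string is still rejected.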