python regex string verification

Regular expression match when specific digits AND words appear


I am quite new to regex and am working on string verification where two conditions must both be met: the text must contain a 7-digit number starting with 4 or 7, and it must also contain one of the provided words.

What I have managed so far:

\b((4|7)\d{6})\b|(\border|Order|Bestellung|bestellung|commande|Commande|ordine|Ordine|objednavku|Objednavku|objednavka|Objednavka)

The regex above correctly finds the numbers, but the words come after the OR operator (|), whereas I need them to follow AND logic instead.

Could you please help me change it so that it works as an AND between the digits and the words?


Solution

  • You can use

    (?s)^(?=.*\b(?:order|Order|Bestellung|bestellung|commande|Commande|ordine|Ordine|objednavku|Objednavku|objednavka|Objednavka)\b).*\b([47]\d{6})\b
    

    If case-insensitive matching is acceptable (re.I, or the inline i flag used here), you can shorten it to

    (?si)^(?=.*\b(?:order|bestellung|commande|ordine|objednavk[ua])\b).*\b([47]\d{6})\b
    


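    Pattern details:

    • (?s) - the DOTALL flag, so that . also matches line break chars
    • ^ - start of string
    • (?=.*\b(?:order|Order|Bestellung|bestellung|commande|Commande|ordine|Ordine|objednavku|Objednavku|objednavka|Objednavka)\b) - a positive lookahead that requires one of the listed whole words to appear somewhere after the start of the string. It only checks and consumes no text, so this is what gives the AND logic
    • .* - zero or more chars, as many as possible
    • \b([47]\d{6})\b - a 7-digit number as a whole word that starts with 4 or 7, captured in group 1. Because the preceding .* is greedy, this is the last such number in the string.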
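To illustrate the AND behaviour described above, here is a minimal sketch that runs the case-sensitive pattern against a few inputs. The sample strings are invented for illustration and are not from the question; the point is only that the keyword may appear before or after the number, and that a string missing either part is rejected.

```python
import re

# Case-sensitive pattern from the answer, split over lines for readability.
PATTERN = re.compile(
    r'(?s)^(?=.*\b(?:order|Order|Bestellung|bestellung|commande|Commande'
    r'|ordine|Ordine|objednavku|Objednavku|objednavka|Objednavka)\b)'
    r'.*\b([47]\d{6})\b'
)

# Hypothetical sample inputs (assumptions, made up for this sketch).
samples = [
    "Order 4123456 has shipped",    # keyword before the number
    "Ref 7654321, see Bestellung",  # keyword after the number
    "Order 5123456 has shipped",    # number does not start with 4 or 7
    "Parcel 4123456 has shipped",   # no keyword present
]

results = [bool(PATTERN.search(s)) for s in samples]
print(results)  # [True, True, False, False]
```

The second sample matches because the lookahead scans from the start of the string without consuming anything, so the relative order of word and number does not matter.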

    Do not forget to use a raw string literal to define a regex in Python code:

    pattern = r'(?si)^(?=.*\b(?:order|bestellung|commande|ordine|objednavk[ua])\b).*\b([47]\d{6})\b'
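As a usage sketch, the pattern can be passed to re.search and the number read from group 1. The helper name find_order_number and the test strings are assumptions made for this example, not part of the original answer.

```python
import re

pattern = r'(?si)^(?=.*\b(?:order|bestellung|commande|ordine|objednavk[ua])\b).*\b([47]\d{6})\b'

def find_order_number(text):
    """Return the captured 7-digit number, or None if either condition fails.

    Hypothetical helper, named here for illustration only.
    """
    match = re.search(pattern, text)
    return match.group(1) if match else None

# The i flag accepts "ORDER"; the s flag lets . cross the line break.
print(find_order_number("Your ORDER\nnumber is 4000123"))  # 4000123

# With several candidates, the greedy .* means the last one is captured.
print(find_order_number("commande 7000001 et 4999999"))    # 4999999

# No keyword, so the lookahead fails and there is no match.
print(find_order_number("Invoice 4000123"))                 # None
```

If the first number is wanted rather than the last, the .* before the capturing group could be made lazy (.*?); that is a variation on the answer, not something the original states.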