
Regex that matches all German and Austrian mobile phone numbers


I need a Python regex that matches mobile phone numbers from Germany and Austria.

To do so, we first have to understand the structure of a phone number:

  • A mobile number can start with a country calling code, but this code is optional.
  • If the country calling code is used, the trunk prefix is omitted.
  • The prefix consists of the trunk prefix and the company code.
  • The prefix is followed by an individual, unique number of 7 or 8 digits.

List of German prefixes:

  • 0151, 0160, 0170, 0171, 0175, 0152, 0162, 0172, 0173, 0174, 0155, 0157, 0159, 0163, 0176, 0177, 0178, 0179, 0164, 0168, 0169

List of Austrian prefixes:

  • 0664, 0680, 0688, 0681, 0699, 0667, 0650, 0678, 0677, 0676, 0660, 0690, 0665, 0686, 0670

Now that we know all the rules needed to build a regex, we have to consider that people sometimes write numbers in unusual ways, with multiple whitespace characters, / or (). For example:

  • 0176 98 600 18 9
  • +49 17698600189
  • +(49) 17698600189
  • 0176/98600189
  • 0176 / 98600189
  • many more ways to write the same number
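
To make the point concrete, here is a small sketch showing that the spellings listed above carry the same digits once the separators are removed. The helper name `digits_only` is made up for this illustration, and stripping separators is only a way to compare the variants, not the regex being asked for.

```python
import re

# The sample spellings from the list above.
variants = [
    "0176 98 600 18 9",
    "+49 17698600189",
    "+(49) 17698600189",
    "0176/98600189",
    "0176 / 98600189",
]

def digits_only(raw: str) -> str:
    """Hypothetical helper: keep only digits and plus signs."""
    return re.sub(r"[^\d+]", "", raw)

for v in variants:
    # National spellings reduce to 017698600189,
    # international ones to +4917698600189.
    print(f"{v!r:24} -> {digits_only(v)}")
```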

I am looking for a Python regex which can match all Austrian and German mobile numbers.

What I have so far is this:

^(?:\+4[39]|004[39]|0|\+\(49\)|\(\+49\))\s?(?=(?:[^\d\n]*\d){10,11}(?!\d))(\()?[19][1567]\d{1,2}(?(1)\))\s?\d(?:[ /-]?\d)+

Solution

  • You can use

    (?x)^          # Free spacing mode on and start of string
     (?:           # A container group:
       (\+49|0049|\+\(49\)|\(\+49\))? [ ()\/-]*  # German: country code
       (?(1)|0)1(?:5[12579]|6[023489]|7[0-9])    #         trunk prefix and company code
     |                                           # or
       (\+43|0043|\+\(43\)|\(\+43\))? [ ()\/-]*  # Austrian:  country code
       (?(2)|0)6(?:64|(?:50|6[0457]|7[0678]|8[0168]|9[09])) # trunk prefix and company code
     )
     [ ()\/-]*   # zero or more spaces, parens, / and -
     \d(?:[ \/-]*\d){6,7} # a digit, then six or seven more digits, each optionally preceded by spaces, / or -
     \s* # zero or more whitespace characters
    $ # end of string
    

    See the regex demo.

    A one-line version of the pattern is

    ^(?:(\+49|0049|\+\(49\)|\(\+49\))?[ ()\/-]*(?(1)|0)1(?:5[12579]|6[023489]|7[0-9])|(\+43|0043|\+\(43\)|\(\+43\))?[ ()\/-]*(?(2)|0)6(?:64|(?:50|6[0457]|7[0678]|8[0168]|9[09])))[ ()\/-]*\d(?:[ \/-]*\d){6,7}\s*$
    

    See this demo.
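
As a quick check in Python, the one-line pattern can be compiled with `re` and run against the spellings from the question. This is a minimal sketch: the names `MOBILE_RE` and `is_mobile` are chosen here for illustration, and the Austrian sample numbers are invented test inputs, not real subscribers.

```python
import re

# The one-line pattern from above, split across adjacent raw strings
# for readability only; the concatenation is the same pattern.
MOBILE_RE = re.compile(
    r"^(?:(\+49|0049|\+\(49\)|\(\+49\))?[ ()\/-]*(?(1)|0)1(?:5[12579]|6[023489]|7[0-9])"
    r"|(\+43|0043|\+\(43\)|\(\+43\))?[ ()\/-]*(?(2)|0)6(?:64|(?:50|6[0457]|7[0678]|8[0168]|9[09])))"
    r"[ ()\/-]*\d(?:[ \/-]*\d){6,7}\s*$"
)

def is_mobile(raw: str) -> bool:
    """Return True if the whole string looks like a German or Austrian mobile number."""
    return MOBILE_RE.match(raw) is not None

samples = [
    "0176 98 600 18 9",      # from the question
    "+49 17698600189",       # from the question
    "+(49) 17698600189",     # from the question
    "0176/98600189",         # from the question
    "0176 / 98600189",       # from the question
    "+43 664 1234567",       # invented Austrian sample
    "+49 0176 98600189",     # country code plus trunk prefix: rejected
    "0123 4567890",          # 012x is not in the prefix list: rejected
]

for s in samples:
    print(f"{s!r:24} {is_mobile(s)}")
```

The conditional `(?(1)|0)` is what enforces the rule from the question: the trunk prefix `0` is required only when no country code was captured.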

    How to create the company code regex

    1. Go to the "Optimize long lists of fixed string alternatives in regex" post
    2. Click the Run code snippet button at the bottom of the answer to run the last code snippet
    3. Resize the input box if you wish
    4. Get the list of your supported numbers, either comma- or linebreak-separated, and paste it into the field
    5. Click the Generate button and grab the pattern that appears below.
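
If you would rather stay in Python, the same idea can be approximated with a few lines of code. This is a simplified stand-in for the linked generator, not a reimplementation of it: the function name `company_code_pattern` is made up here, and it assumes every prefix has four digits, starts with the trunk prefix `0`, and shares the same second digit.

```python
import re
from collections import defaultdict

# Prefix lists copied from the question (duplicates are harmless here).
GERMAN = ("0151, 0160, 0170, 0171, 0175, 0152, 0162, 0172, 0173, 0174, 0155, "
          "0157, 0159, 0163, 0176, 0177, 0178, 0179, 0164, 0168, 0169").split(", ")
AUSTRIAN = ("0664, 0680, 0688, 0681, 0699, 0664, 0667, 0650, 0678, 0650, 0677, "
            "0676, 0660, 0699, 0690, 0665, 0686, 0670").split(", ")

def company_code_pattern(prefixes):
    """Hypothetical helper: build a compact alternation for the codes after the trunk '0'."""
    if any(len(p) != 4 or not p.startswith("0") for p in prefixes):
        raise ValueError("expected four-digit prefixes starting with 0")
    codes = sorted({p[1:] for p in prefixes})      # drop trunk prefix, dedupe
    leads = {c[0] for c in codes}
    if len(leads) != 1:
        raise ValueError("expected one shared leading digit")
    # Group the last digit of each code under its middle digit.
    by_middle = defaultdict(list)
    for code in codes:
        by_middle[code[1]].append(code[2])
    parts = []
    for middle, lasts in sorted(by_middle.items()):
        tail = lasts[0] if len(lasts) == 1 else "[" + "".join(lasts) + "]"
        parts.append(middle + tail)
    return leads.pop() + "(?:" + "|".join(parts) + ")"

print(company_code_pattern(GERMAN))
print(company_code_pattern(AUSTRIAN))
```

The result accepts the same set of prefixes as the company-code part of the pattern above, although the spelling can differ (for example, a listed-out character class instead of `[0-9]`).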