python regex regex-group

Regex for phone numbers


I want to extract phone numbers from a text using regex.

Examples:

I have this regex which finds the phone number very well: ^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$

and it catches all the numbers below well.

But I cannot catch the "tel.", "tlf", "mobil:", etc. that may come before the number. Also, if a letter comes directly after the last digit, the number is no longer matched, but it should be.

These examples are not covered:

tel.: +45 09827374, +45 89895867, some kind of text... 
mobil: +45 20802020, +45 20802001,
tlf.: +45 5555 1212 
tlf: +4567890202Girrafe

If it helps, I found this regex: '\btlf\b\D*([\d\s]+\d)'. It matches the "tlf" prefix, captures the number, and stops before the next letter.

So I tried to combine them, but the result doesn't work: \b(tlf|mobil|telephone|mobile|tel)\b\D*(^((\(?\+45\)?)?)(\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{2})$)

Expected output:

  • for input: "tel.: +45 09827374, +45 89895867, some kind of text..." --> output: "tel.: +45 09827374" and "+45 89895867"
  • for input: "mobil: +45 20802020, +45 20802001," --> output: "mobil: +45 20802020" and "+45 20802001" or "mobil: +45 20802020, +45 20802001" is ok too
  • for input: "tlf +45 5555 1212" --> output: "tlf +45 5555 1212"
  • for input: "tlf: +4567890202Girrafe" --> output: "tlf: +4567890202"
  • for input: "+4567890202" --> output: "+4567890202"

Can you help me please?


Solution

  • If you want the full match only:

    (?:\b(?:tlf|mobile?|tel(?:ephone)?)[.:\s]+)?(?:\(\+45\)|\+45)?\s*\d{2}(?:\s?\d{2}){3}(?!\d)
    

    The pattern matches:

    • (?: Non-capturing group
      • \b A word boundary to prevent a partial word match
      • (?:tlf|mobile?|tel(?:ephone)?) match one of the alternatives
      • [.:\s]+ match 1+ occurrences of either . : or a whitespace char
    • )? Close the non-capturing group and make it optional
    • (?:\(\+45\)|\+45)? Optionally match either +45 or (+45)
    • \s*\d{2}(?:\s?\d{2}){3} Match optional whitespace chars, then 2 digits followed by 3 more pairs of digits, each pair optionally preceded by a whitespace char (8 digits in total)
    • (?!\d) Negative lookahead, assert not a digit directly to the right
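
A minimal sketch of how the pattern above could be used from Python, assuming the standard `re` module and `finditer` to collect every full match. The names `PHONE_RE`, `find_numbers`, and `samples` are made up for illustration; the pattern itself is unchanged, only split across several string literals so each part can carry a comment.

```python
import re

# The pattern from the answer, split into adjacent raw strings for readability.
PHONE_RE = re.compile(
    r"(?:\b(?:tlf|mobile?|tel(?:ephone)?)[.:\s]+)?"  # optional label such as "tel.:" or "mobil:"
    r"(?:\(\+45\)|\+45)?"                            # optional country code: (+45) or +45
    r"\s*\d{2}(?:\s?\d{2}){3}"                       # four pairs of digits, optionally space-separated
    r"(?!\d)"                                        # must not be followed by another digit
)


def find_numbers(text):
    """Return every full match of PHONE_RE in text, in order of appearance."""
    return [m.group(0) for m in PHONE_RE.finditer(text)]


# Inputs taken from the question.
samples = [
    "tel.: +45 09827374, +45 89895867, some kind of text...",
    "mobil: +45 20802020, +45 20802001,",
    "tlf +45 5555 1212",
    "tlf: +4567890202Girrafe",
    "+4567890202",
]

for sample in samples:
    print(find_numbers(sample))
```

Because the label group is optional, the second number on a line is returned without a label, which is one of the outputs the question accepts. The pattern is case-sensitive as written; passing `re.IGNORECASE` to `re.compile` would be one way to also accept labels like "Tlf", if that is wanted.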
