Tags: java, python, phone-number, libphonenumber

Retrieve phone numbers from a text file irrespective of format


I want to extract phone numbers from a text file. I am using the third-party Python port of Google's phone number library, but it fails to find numbers that have spaces or dashes between the digit groups, for example "+91 – 9871127622".

Is there any way around it?

If not, I would like to install Google's original phone number library, but I don't know how to install it and use it from my code (no instructions are provided).

My Python code is as follows:

#!/usr/bin/env python
import sys

import phonenumbers

file_name = sys.argv[1]

# A region of None means only numbers written with a leading "+" are considered.
with open(file_name, "r") as fp:
    for line in fp:
        for match in phonenumbers.PhoneNumberMatcher(line, None):
            print(match)

Solution

  • You can use a regular expression to quickly strip unwanted characters from your input.

    The regular expression is [^\d], written as "[^\\d]" in a Java string literal. It matches any character that is not a digit. Replacing every match with an empty string leaves only the digits.

    Here's something to get you started:

    public class CleanPhoneNumber {
    
        public static void main(String[] args) {
            String inputPhoneNumber = "+91 – 9871127622";
            String validPhoneNumber = cleanup(inputPhoneNumber);
            System.out.println(validPhoneNumber);
        }
    
        public static String cleanup(String inputPhoneNumber) {
            return inputPhoneNumber.replaceAll("[^\\d]", "");
        }
    
    }
    

    This is only a starting point; you can refine the regular expression to suit the formats that appear in your file.

    PS: I don't use Python, but the same approach works there too.

    Update, based on Ole V.V.'s comment: keep a leading "+" so that the international prefix is not lost.

    public static String cleanup(String inputPhoneNumber) {
        String cleanedUp = inputPhoneNumber.replaceAll("[^\\d]", "");
        if (inputPhoneNumber.startsWith("+")) {
            return "+" + cleanedUp;
        }
        return cleanedUp;
    }
    

    Hope this helps!
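
Since the question is about Python, here is a rough Python sketch of the same idea, using only the standard re module. The cleanup function mirrors the Java method above. The CANDIDATE pattern and the find_numbers helper are my own assumptions for picking number-like runs out of a line of text; they are not part of the phonenumbers library and will need tuning for the formats in your file.

```python
import re

# Anything that is not a digit (same idea as [^\d] in the Java version).
NON_DIGITS = re.compile(r"\D")

# Hypothetical candidate pattern (an assumption, not from any library):
# an optional "+", a digit, then at least six digits or separators
# (whitespace, hyphen, en dash U+2013, parentheses, dot), ending in a digit.
CANDIDATE = re.compile(r"\+?\d[\d\s\-\u2013().]{6,}\d")


def cleanup(raw_number):
    """Strip everything except digits, keeping a leading '+' if present."""
    digits = NON_DIGITS.sub("", raw_number)
    if raw_number.startswith("+"):
        return "+" + digits
    return digits


def find_numbers(line):
    """Return cleaned-up versions of every number-like run found in line."""
    return [cleanup(m.group()) for m in CANDIDATE.finditer(line)]


print(cleanup("+91 – 9871127622"))  # +919871127622
print(find_numbers("Call +91 – 9871127622 today"))
```

The cleaned strings can then be handed to whatever validation you prefer. Cleaning an entire line at once would merge unrelated digits, which is why the sketch first isolates candidate runs and only then strips the separators.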