python, json, text

How can I extract multiple email IDs and phone numbers from a single text file with Python?


Hello, I have a large text file containing many kinds of information. I'd like to extract only the email IDs and phone numbers with a Python program or a tool. The file looks like this:

HTTP/1.1 200 OK

{"id":"269","first_name":"N S","last_name":"","balance":"0","phonecode":null,"mobile":null,"email":"[email protected]","verified":"0","password":""}


HTTP/1.1 200 OK


{"id":"303","first_name":"Devi","last_name":"Baruah","balance":"0","phonecode":null,"mobile":null,"email":"[email protected]","verified":"0","password":""}


HTTP/1.1 200 OK


{"id":"306","first_name":"Rashmi","last_name":"Kumari","balance":"24","phonecode":"91","mobile":"9xxxxxxx","email":"[email protected]","verified":"1","password":"xxxx"}


HTTP/1.1 200 OK

{"id":"308","first_name":"ashwini","last_name":"gokhale","balance":"7","phonecode":"1","mobile":"61xxxx","email":"[email protected]","verified":"1","password":"xxxxxxx"}


HTTP/1.1 200 OK

{"id":"307","first_name":"Rama","last_name":"De","balance":"0","phonecode":"91","mobile":"73xxxxxx","email":"[email protected]","verified":"1","password":"xxxx"}

Solution

  • This looks like a log from a web server. If possible, try to start from a cleaner file;
    otherwise, something like this works:

    import json

    mandatory_keys = ['email', 'mobile']
    out = []
    with open('test') as fd:
        # Keep only the JSON lines; this skips the "HTTP/1.1 200 OK" lines and blanks.
        file_str = [x.strip() for x in fd if x.startswith('{')]
    for j_str in file_str:
        try:
            j = json.loads(j_str)
        except json.JSONDecodeError as err:
            # Report the offending line instead of hiding the cause.
            raise ValueError(f'Something wrong with the json: {j_str!r}') from err
        missing = [x for x in mandatory_keys if x not in j]
        assert not missing, f'missing mandatory_keys: {missing}'
        # Keep only the requested fields.
        out.append({k: v for k, v in j.items() if k in mandatory_keys})

    print(out)
    

    You may also want to use a JSON schema validator such as 'jsonschema' in place of the assert line, so that you get a clear error message.
    By changing the mandatory_keys list you can easily change which fields end up in the output.
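
Since several records in the question have "mobile": null, it may help to drop null values and to report the line number when a line fails to parse. Below is a minimal sketch of that variation using only the standard library. The function name extract_contacts, the inclusion of phonecode, and the sample data are assumptions made for illustration; they are not part of the original answer.

```python
import json


def extract_contacts(lines, keys=("email", "mobile", "phonecode")):
    """Return one dict per JSON line, keeping only `keys` whose value is not null."""
    contacts = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip "HTTP/1.1 200 OK" status lines and blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_no}: invalid JSON ({err.msg})") from err
        # Drop missing keys and null values such as "mobile": null.
        found = {k: record[k] for k in keys if record.get(k) is not None}
        if found:
            contacts.append(found)
    return contacts


# Hypothetical sample data shaped like the log in the question.
sample = [
    "HTTP/1.1 200 OK",
    "",
    '{"id":"1","phonecode":null,"mobile":null,"email":"a@example.com"}',
    "HTTP/1.1 200 OK",
    '{"id":"2","phonecode":"91","mobile":"5550100","email":"b@example.com"}',
]
result = extract_contacts(sample)
print(result)
```

To run it on a real file, pass the open file object instead of the list, for example `with open('test') as fd: extract_contacts(fd)`, since iterating over a file yields its lines.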