python gmail-api

Accessing a link in an email body using the Gmail API


I am using the Gmail API with Python to access my Gmail inbox. The message is parsed into mime_msg, as shown below. I want to extract the URL 'http://example.com/newpasswordid=exampleid12345' from the printed body. How can I do that?

import base64
import email

msg_str = base64.urlsafe_b64decode(full_message['raw'].encode('ASCII'))
mime_msg = email.message_from_bytes(msg_str)
print(mime_msg)
<a href =3D 'http://example.com/newpasswordid=exampleid12345'>Link1</a><br><br>.</td></tr>

<tr><td><i>=A92020 For more info please visit=
 <a href=3D" https://example2.com/">Link2</a=></i></td></tr>

Solution

  • If that text is available as a string in mime_msg (for a Message object, str(mime_msg) gives the text that print shows) and you only need the URL, a regular expression is enough. Assuming the URL you want is always inside the first pair of single quotes, this code extracts it:

    import re
    
    mime_msg = """
    <a href =3D 'http://example.com/newpasswordid=exampleid12345'>Link1</a><br><br>.</td></tr>
    
    <tr><td><i>=A92020 For more info please visit=
     <a href=3D" https://example2.com/">Link2</a=></i></td></tr>
     """
    
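    # Non-greedy match of whatever sits between the first pair of single quotes
    exp = re.compile(r"'(.*?)'")
    # Strip line breaks so the match can span lines
    mime_msg = re.sub(r"[\n\r]+", '', mime_msg)
    m = exp.search(mime_msg)
    if m:  # search() returns None when nothing matches
        print(m.group(1))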
    

    Result:

    http://example.com/newpasswordid=exampleid12345
    

    To handle more complex mail bodies, you could extend the regular expression, for example by matching the href attribute explicitly instead of the first quoted string.
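
As one alternative to a larger regular expression, the body can be decoded and parsed as HTML. The sketch below is not part of the original answer. It assumes the message contains a text/html part, and the names LinkCollector, html_body, extract_links, and sample_raw are hypothetical, made up for illustration. It relies on get_payload(decode=True) to undo the quoted-printable encoding (the =3D sequences and trailing = signs in the printed message) and on the standard library's html.parser to collect href values.

```python
from email import message_from_bytes
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Some senders leave stray spaces inside the quotes
                    self.links.append(value.strip())


def html_body(mime_msg):
    """Return the decoded text of the first text/html part, or '' if none."""
    for part in mime_msg.walk():
        if part.get_content_type() == "text/html":
            # decode=True reverses the Content-Transfer-Encoding
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def extract_links(html_text):
    """Return every href found in <a> tags, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical stand-in for the bytes decoded from full_message['raw']
sample_raw = (
    b'Content-Type: text/html; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<a href=3D'http://example.com/newpasswordid=3Dexampleid12345'>Link1</a>\r\n"
    b"<i>=A92020 For more info please visit=\r\n"
    b' <a href=3D" https://example2.com/">Link2</a></i>\r\n'
)

links = extract_links(html_body(message_from_bytes(sample_raw)))
print(links)
```

With a real message, mime_msg from the question would take the place of message_from_bytes(sample_raw), and links[0] would then be the first link in the body. Parsing the HTML avoids depending on which quote style the sender happened to use.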