Search code examples
python-3.xsplitemail-headers

python3/email: parsing a list of email addresses with embedded commas?


I know how to use email.utils.parseaddr() to parse an email address. However, I want to parse a list of multiple email addresses, such as the address portion of this header:

Cc: "abc" <[email protected]>, "www, xxyyzz" <[email protected]>

In general, I know I can split on a regex like \s*,\s* to get the individual addresses, but in my example, the name portion of one of the addresses contains a comma, and this regex therefore will split the header incorrectly.

I know how to manually write state-machine-based code to properly split that address into pieces, and I also know how to code a complicated regex that would match each email address. I'm not asking for help in writing such code. Rather, I'm wondering if there are any existing python modules which I can use to properly split this email address list, so I don't have to "re-invent the wheel".

Thank you in advance.


Solution

  • Borrowing the answer from this question How do you extract multiple email addresses from an RFC 2822 mail header in python?

    msg = 'Cc: "abc" <[email protected]>, "www, xxyyzz" <[email protected]>'
    
    import email.utils
    
    print(email.utils.getaddresses([msg]))
    

    produces:

    [('abc', '[email protected]'), ('www, xxyyzz', '[email protected]')]