Parsing Message-ID header returned by imaplib


I'm fetching the Message-ID header from emails in Gmail via IMAP.

This code:

# fetch() returns a (status, data) pair; the data list is what is shown below
typ, data = m.fetch(num, '(BODY[HEADER.FIELDS (MESSAGE-ID)])')
print(data)

returns this:

[('1 (BODY[HEADER.FIELDS (MESSAGE-ID)] {78}', 'Message-ID: <actualmessageid@mail.mail.gmail.com>\r\n\r\n'), ')']

How would I parse just the actual message-id out of that?


Solution

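  • You can do this with the email module: HeaderParser.parsestr() (same API as Parser, but it ignores the message body) turns the raw header string into a message object, and email.utils.parseaddr() strips the angle brackets from the Message-ID value. The raw header string is the second element of the first tuple in the fetch response.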

    >>> from email.parser import HeaderParser
    >>> from email.utils import parseaddr
    
    >>> hp = HeaderParser()
    
    >>> response = [('1 (BODY[HEADER.FIELDS (MESSAGE-ID)] {78}',
                     'Message-ID: <actualmessageid@mail.mail.gmail.com>\r\n\r\n'), ')']
    
    >>> header_string = response[0][1]
    
    >>> header_string
    'Message-ID: <actualmessageid@mail.mail.gmail.com>\r\n\r\n'
    
    >>> header = hp.parsestr(header_string)
    
    >>> header
    <email.message.Message instance at 0x023A6198>
    
    >>> header['message-id']
    '<actualmessageid@mail.mail.gmail.com>'
    
    >>> msg_id = parseaddr(header['message-id'])
    
    >>> msg_id
    ('', 'actualmessageid@mail.mail.gmail.com')
    
    >>> msg_id[1]
    'actualmessageid@mail.mail.gmail.com'
    

    Thus:

    from email.parser import HeaderParser
    from email.utils import parseaddr
    
    hp = HeaderParser()
    
    def get_id(response):
        header_string = response[0][1]
        header = hp.parsestr(header_string)
        return parseaddr(header['message-id'])[1]
    
    response = [('1 (BODY[HEADER.FIELDS (MESSAGE-ID)] {78}',
                 'Message-ID: <actualmessageid@mail.mail.gmail.com>\r\n\r\n'), ')']
    
    
    print(get_id(response))
    

    prints:

    actualmessageid@mail.mail.gmail.com
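The session above is from Python 2, where imaplib hands back str. On Python 3 the same fetch returns bytes, so a variant along the following lines may be closer to what you need. This is only a sketch: the function name get_id_from_bytes and the sample data are made up for illustration, and it assumes the response has the same shape as the one in the question.

```python
from email.parser import BytesHeaderParser
from email.utils import parseaddr


def get_id_from_bytes(data):
    """Return the bare Message-ID from the data list of an imaplib fetch.

    Hypothetical helper; assumes `data` looks like the response in the
    question, but with bytes instead of str (as on Python 3).
    """
    parser = BytesHeaderParser()
    for part in data:
        # Literal payloads arrive as (envelope, payload) tuples; the
        # trailing b')' item is plain bytes and is skipped here.
        if isinstance(part, tuple):
            header = parser.parsebytes(part[1])
            message_id = header['message-id']
            if message_id is not None:
                # parseaddr() returns (realname, address); the address
                # part is the Message-ID without angle brackets.
                return parseaddr(message_id)[1]
    return None


# Sample data modelled on the response in the question (bytes version).
sample = [(b'1 (BODY[HEADER.FIELDS (MESSAGE-ID)] {78}',
           b'Message-ID: <actualmessageid@mail.mail.gmail.com>\r\n\r\n'), b')']

print(get_id_from_bytes(sample))
```

Looping over the parts instead of indexing response[0][1] directly is a design choice: it lets the function return None for a message that has no Message-ID header rather than raising.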