
What protocol does Google use for Gmail? (not IMAP or POP)


You can access Gmail through the web interface, Google's Android client, or IMAP. As far as I can tell, the web interface and the Android app use a completely different protocol than IMAP; they are not just interfaces on top of it. I'm sure of that because the Android app can open a folder with 1 million messages in under 3 seconds without any problem. No plain IMAP client can do that.

So my question is what is known about this secret protocol? Where is the reference documentation for it? Has it been reverse engineered? Does Google sanction its use?

arnt's answer provides an excellent method to test Gmail's raw IMAP speed:

$ openssl s_client -host imap.gmail.com -port 993 -crlf 
...
* OK Gimap ready for requests from 12.34.56.78
$ a LOGIN ***@*** ***
a OK
$ c SELECT "[Gmail]/All mail" !!!!
* FLAGS (\Answered \Flagged \Draft \Deleted \Seen)
* OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen \*)] Flags permitted.
* OK [UIDVALIDITY 673376278] UIDs valid.
* 1142417 EXISTS
* 0 RECENT
* OK [UIDNEXT 1159771] Predicted next UID.
* OK [HIGHESTMODSEQ 8670601]
c OK [READ-WRITE] [Gmail]/All mail selected. (Success)

The command I've marked, c SELECT "[Gmail]/All mail", takes about 20 seconds to complete. That is longer than it takes the Gmail app on my relatively underpowered Android phone to start up and load the All mail label, which it does in less than 6 seconds even after I purged its caches. The web client is even faster.

Unless I'm missing something basic, this proves "beyond reasonable doubt" that Google's Gmail clients do not use IMAP, since in those clients you never have to wait 20 seconds for a mailbox to open, which is what a single SELECT costs here.
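
The same measurement can be scripted instead of typed by hand. Below is a minimal sketch using Python's standard imaplib. The helper names time_select and main are made up for illustration, the credentials are placeholders, and it assumes the account accepts password logins over IMAP (Gmail may require an app password or OAuth instead).

```python
import imaplib
import time


def time_select(conn, mailbox):
    """Time one SELECT on an already logged-in IMAP connection.

    Returns (elapsed_seconds, message_count).
    """
    start = time.monotonic()
    # select() without readonly=True sends SELECT (readonly would send EXAMINE).
    status, data = conn.select(mailbox)
    elapsed = time.monotonic() - start
    if status != 'OK':
        raise RuntimeError('SELECT failed: %r' % (data,))
    # imaplib returns the EXISTS count as bytes in data[0].
    return elapsed, int(data[0])


def main(user, password):
    """Connect to Gmail over IMAP and print how long SELECT takes."""
    conn = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    try:
        conn.login(user, password)
        # Mailbox names containing spaces must be quoted by the caller.
        elapsed, count = time_select(conn, '"[Gmail]/All mail"')
        print('SELECT took %.2f s for %d messages' % (elapsed, count))
    finally:
        conn.logout()
```

Calling main('you@example.com', 'your-password') runs the measurement. The timing covers only the SELECT round trip, not the TLS handshake or the login.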


Solution

  • After more research, I found that there is an API for Gmail (https://developers.google.com/gmail/api/). I don't think that API had been released when I posted this question back in 2013.

    Using that API, I have created a demo program that fetches the last 100 messages of a label: https://gist.github.com/bjourne/37a9d15b03862003008a1b0169190dbe

    The relevant part of the program is:

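    from time import time

    # `service` is an authorized Gmail API client; `label`, `to_mail` and
    # `headers_params` are defined elsewhere in the gist.
    resource = service.users().messages()
    # list() returns only message ids, one page at a time.
    result = resource.list(userId = 'me', labelIds = [label]).execute()
    mail_ids = [m['id'] for m in result.get('messages', [])]

    start = time()
    mails = []
    # Batch callbacks receive (request_id, response, exception).
    def cb(request_id, response, exception):
        if exception is None:
            mails.append(to_mail(response))
    # Fetch every message in one batched HTTP request. The service's own
    # batch factory replaces a bare BatchHttpRequest(), which targets the
    # retired global batch endpoint.
    batch = service.new_batch_http_request(callback = cb)
    for mail_id in mail_ids:
        batch.add(resource.get(**headers_params(mail_id)))
    batch.execute()
    print('Took %.2f seconds' % (time() - start))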
    

    It lists the last 100 messages sorted by date in a label (folder in IMAP terminology) containing over 570k messages.

    On my machine, the batched fetch takes about 0.5 to 0.8 seconds. I can confidently claim that no pure IMAP client comes even close. IMAP likely won't ever get faster, because it is a poor fit for how Google stores mail internally.

    So I'll answer my own question. This is the API they are using and it wasn't exposed earlier.
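
The snippet in this answer calls two helpers, headers_params and to_mail, that live in the gist and are not shown here. The following is only a guess at what such helpers might look like, not the gist's actual code. It assumes the messages.get call is made with format='metadata' so that only headers come back, and the choice of headers in WANTED_HEADERS is arbitrary.

```python
# Hypothetical stand-ins for the gist's helpers; the real ones may differ.
WANTED_HEADERS = ['From', 'Subject', 'Date']


def headers_params(mail_id):
    """Keyword arguments for users().messages().get() that ask for headers only."""
    return {
        'userId': 'me',
        'id': mail_id,
        'format': 'metadata',
        'metadataHeaders': WANTED_HEADERS,
    }


def to_mail(response):
    """Flatten a messages.get response into a dict of id plus lowercased headers."""
    mail = {'id': response['id']}
    # Headers arrive as a list of {'name': ..., 'value': ...} dicts.
    for header in response.get('payload', {}).get('headers', []):
        mail[header['name'].lower()] = header['value']
    return mail


# A hand-written response in the shape returned by messages.get.
sample = {
    'id': 'abc123',
    'payload': {'headers': [
        {'name': 'From', 'value': 'alice@example.com'},
        {'name': 'Subject', 'value': 'Hello'},
    ]},
}
print(to_mail(sample))
```

Requesting metadata instead of full messages keeps each response small, which is probably a large part of why the batch comes back so quickly.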