Tags: python, google-api, gmail, gmail-api, google-api-python-client

Fastest way to get emails from Gmail and write them into file


I'm making a script which gets n emails from my Gmail inbox and writes their n subjects into a text file. While this works fine currently, I am looking for a way to get, for instance, 20 emails in JSON format with only one call, instead of going one by one inside a loop.

For the moment I have this:

from __future__ import print_function
import pickle
import os.path

from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']

def main():
    """Shows basic usage of the Gmail API.
    Lists the user's Gmail labels.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('gmail', 'v1', credentials=creds)

    resultsMessages = service.users().messages().list(userId='me', labelIds=['INBOX']).execute()
    messages = resultsMessages.get('messages', [])

    f = open("output.txt", "a")

    message_count = int(input("How many messages do you want to write?"))

    if not messages:
        print("no messages found")
    else:
        print("messages:")
        for i, message in enumerate(messages[:message_count]):
            f.write("message " + str(i) + " ")
            msg = service.users().messages().get(userId='me', id=message['id']).execute()
            headers = msg["payload"]["headers"]
            subject = [h['value'] for h in headers if h["name"] == "Subject"]
            f.write("subject: " + (subject[0] if subject else "(no subject)"))
            f.write("\n")
    f.close()

if __name__ == '__main__':
    main()

This basically gets the IDs of 100 emails and then goes email by email, getting the subject and writing it to a file. It works fine, but I would like to find a faster way. Is there any way I can get n emails in JSON format from the server with only one call? I imagine the bottleneck in my code is the call msg = service.users().messages().get(userId='me', id=message['id']).execute(), which is executed inside the loop.

Thank you very much


Solution

  • I just want to know if there is a way to get, for instance, 20 emails in JSON format with only one call.

    If you check the documentation for the Gmail API, you will find that there is only one method that returns the details of an email: messages.get. It takes a single message ID as a parameter and returns the information about that single message.

    There is no way to send multiple message IDs to messages.get in one call.

    If you are looking for a way to reduce network traffic, you should look into batching the requests, which lets you combine up to 100 messages.get calls into a single HTTP request.

    Note that you will still be charged the quota cost for each of the requests you send in the batch.
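
    A minimal sketch of what that batching could look like with google-api-python-client's new_batch_http_request, reusing the service, messages and message_count variables from the question's script (the format='metadata' and metadataHeaders arguments are an optional extra that keeps each response down to just the headers you need):

# Rough sketch of the batching approach; assumes `service`, `messages`
# and `message_count` already exist as in the question's script.
subjects = []

def handle_message(request_id, response, exception):
    # Called once for every messages.get response in the batch.
    if exception is not None:
        print("request %s failed: %s" % (request_id, exception))
        return
    headers = response["payload"]["headers"]
    subject = next((h["value"] for h in headers if h["name"] == "Subject"), "(no subject)")
    subjects.append(subject)

batch = service.new_batch_http_request(callback=handle_message)
for message in messages[:message_count]:
    # format='metadata' with metadataHeaders keeps each response small,
    # since only the Subject header is needed here.
    batch.add(service.users().messages().get(
        userId='me', id=message['id'],
        format='metadata', metadataHeaders=['Subject']))
batch.execute()  # a single HTTP round trip for up to 100 messages.get calls

with open("output.txt", "a") as f:
    for i, subject in enumerate(subjects):
        f.write("message " + str(i) + " subject: " + subject + "\n")

    Note that the server may return the individual responses in any order, so keep the request_id passed to the callback if you need to match each subject back to its message ID.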