I would like to list all message IDs from a Gmail account using the Gmail API. So far I've been able to list the first and second pages of message IDs. I know I have to use the pageToken to get to the next page of results, but I can't figure out how to restructure my code so that I'm not using numbered variables (1, 2, 3, etc.) for each page. Source code is below.
get_email_ids.py:
from __future__ import print_function
import os.path
from collections import Counter
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
def main():
    """Shows basic usage of the Gmail API.
    """
    creds = None
    user_id = "me"
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    service = build('gmail', 'v1', credentials=creds)
    ### Call the Gmail API
    ### Show messages
    token = ''
    messages = service.users().messages().list(userId=user_id,pageToken=token).execute().get('messages', [])
    token = service.users().messages().list(userId=user_id,pageToken=token).execute().get('nextPageToken', [])
    print(messages,token)
    messages2 = service.users().messages().list(userId=user_id,pageToken=token).execute().get('messages', [])
    token2 = service.users().messages().list(userId=user_id,pageToken=token).execute().get('nextPageToken', [])
    print(messages2,token2)

if __name__ == '__main__':
    main()
Results of get_email_ids.py (shortened):
[{'id': '179ed5ae720de1f6', 'threadId': '179ed5ae720de1f6'}, ... {'id': '179ba226644a079a', 'threadId': '17972318184138fa'}] 09573475999783117733
[{'id': '179b9f8852d3b09d', 'threadId': '179b9f8852d3b09d'}, ... {'id': '1797fa390caa3454', 'threadId': '1797fa390caa3454'}] 07601624978802434502
I can't test it, but I would reuse the same variables, messages and token, without the 1, 2, 3 suffixes, add each page's results to a single list holding all messages, and run the whole thing in a loop.
Something like this:
all_messages = []
token = ''
while True:
    # One request per page: read the messages and the next token from the same response
    response = service.users().messages().list(userId=user_id, pageToken=token).execute()
    messages = response.get('messages', [])
    token = response.get('nextPageToken', '')
    print(messages, token)
    #all_messages.extend(messages)  # `extend` or `+=`, not `append`
    all_messages += messages  # `extend` or `+=`, not `append`
    if not token:
        # Stop when there is no next page token (see EDIT below)
        break
I just don't know how the API signals that there are no more messages: it might return an empty list, return an empty token, or raise an error.
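One way to sidestep that question, as far as I know, is the `list_next()` helper that the Google API Python client generates for paginated methods: it takes the previous request and response and returns `None` when there is no further page. Below is a minimal, untested-against-Gmail sketch; the function name `list_all_message_ids` is my own, and `service` is assumed to be the object returned by `build('gmail', 'v1', credentials=creds)`.

```python
def list_all_message_ids(service, user_id="me"):
    """Collect every message ID by following pages with list_next()."""
    messages_api = service.users().messages()
    request = messages_api.list(userId=user_id)
    ids = []
    while request is not None:
        response = request.execute()
        # A page without a 'messages' key contributes nothing.
        ids.extend(m["id"] for m in response.get("messages", []))
        # list_next() builds the request for the following page, or returns
        # None when the response carries no nextPageToken.
        request = messages_api.list_next(request, response)
    return ids
```

With this approach the loop never touches the page token directly, so there is no token variable to number or reset.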
EDIT:
Information for other users: as @emmalynnh mentioned in a comment,
when there are no more messages the API gives an empty token,
and it will return a 400 error if you try to request the next page.
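Given that, the loop should stop as soon as a response comes back without a token, so the failing request is never sent. Here is a minimal sketch as a generator; `iter_message_ids` and `page_size` are names I made up, `service` is assumed to be the built Gmail client from the question, and I'm assuming `maxResults` accepts values up to 500 as the API reference describes.

```python
def iter_message_ids(service, user_id="me", page_size=500):
    """Yield message IDs page by page until no nextPageToken is returned."""
    token = None
    while True:
        params = {"userId": user_id, "maxResults": page_size}
        if token:
            # Only send a page token that the API actually handed back.
            params["pageToken"] = token
        response = service.users().messages().list(**params).execute()
        for message in response.get("messages", []):
            yield message["id"]
        token = response.get("nextPageToken")
        if not token:
            # Last page reached: stop before making a request with an empty token.
            break
```

Usage would be something like `all_ids = list(iter_message_ids(service))`, or a plain `for` loop if you want to process IDs as each page arrives.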