Search code examples
pythonauthenticationgoogle-sheets-apigspread

Better way to get access to Google Sheet with Python


How to automatize the process of getting access to Google spreadsheets?

Right now we use gspread and oauth2client.service_account to get an access to Google spreadsheet. It works fine, but using OAuth2 credentials makes us manually share every single spreadsheet to "client_email" from credentials json-file.

import gspread
from oauth2client.service_account import ServiceAccountCredentials
scope = [
            'https://www.googleapis.com/auth/spreadsheets',
        'https://www.googleapis.com/auth/drive'
    ]
credentials = 
ServiceAccountCredentials.from_json_keyfile_name('path.json', scope)
    gs = gspread.authorize(credentials)

That works, but how to modify?

So the desired outcome is: somebody shares a spreadsheet with me and I can start to work with it immediately in Python. Is it possible? Maybe we can use some triggers from incoming emails with the information about sharing or something similar?


Solution

  • You can try this script. It has a few sections we can differentiate:

    1. Requesting access to Drive and Gmail. As you see we use the full drive scope instead of drive.file. This is because there is an existing bug that causes drive.file to crash(1), so meanwhile we have to use this.
         from __future__ import print_function
            import pickle
            import sys
            import os.path
            from googleapiclient.discovery import build
            from google_auth_oauthlib.flow import InstalledAppFlow
            from google.auth.transport.requests import Request
    
            # If modifying these scopes, delete the file token.pickle.
            SCOPES = ['https://www.googleapis.com/auth/drive', 
                  'https://www.googleapis.com/auth/gmail.modify']
    
            creds = None
            # The file token.pickle stores the user's access and refresh tokens, and is
            # created automatically when the authorization flow completes for the first
            # time.
            if os.path.exists('token.pickle'):
                with open('token.pickle', 'rb') as token:
                    creds = pickle.load(token)
            # If there are no (valid) credentials available, let the user log in.
    
            if not creds or not creds.valid:
                if creds and creds.expired and creds.refresh_token:
                   creds.refresh(Request())
                else:
                    flow = InstalledAppFlow.from_client_secrets_file(
                        'credentials.json', SCOPES)
                    creds = flow.run_local_server()
                # Save the credentials for the next run
                with open('token.pickle', 'wb') as token:
                    pickle.dump(creds, token)
            mail_service = build('gmail', 'v1', credentials=creds)
            drive_service = build('drive','v3', credentials=creds)
    
    1. Declaring some variables. There is no issue here, just declaring the variables where we will keep the mail ids, the file name and the file name formatted for our needs.
         mail_ids = []
         file_name = []
         name_string = []
    
    1. Get the emails. We will only take the unread emails from drive-shares-noreply. After this we will mark them as “read” so we won’t take them the next time we execute the script.
        def get_emails(mail_ids):
            user_id = 'me' #Or your email
            query = 'from:drive-shares-noreply@google.com, is:UNREAD' #Will search mails from drive shares and unread
            response = mail_service.users().messages().list(userId=user_id, q=query).execute()
            items = response.get('messages', [])
            if not items:
                print('No unread mails found')
                sys.exit()
            else:
                for items in items:
                    mail_ids.append(items['id'])
                for mail_ids in mail_ids:
                    mail_service.users().messages().modify(userId=user_id, id=mail_ids, body={"removeLabelIds":["UNREAD"]}).execute() #Marks the mails as read
    
    1. Get the file names of the emails. The syntax of the Subject of the sharing sheets email is “Filename - Invitation to edit”, so we will take the subject of each email, and we will format the string later.
    def get_filename(mail_ids, file_name):
        user_id = 'me'
        headers = []
        for mail_ids in mail_ids:
            response = mail_service.users().messages().get(userId=user_id, id=mail_ids, format="metadata", metadataHeaders="Subject").execute()
            items = response.get('payload', [])
            headers.append(items['headers'])
        length = len(headers)
        for i in range(length):
            file_name.append(headers[i][0]['value'])
    
    def process_name(file_name, name_string):
        for file_name in file_name:
            name_string.append(str(file_name).replace(" - Invitation to edit", ""))
    
    
    1. Give permissions to the client_email
    def give_permissions(name_string):
        for name_string in name_string:
            body = "'{}'".format(name_string)
            results = drive_service.files().list(q="name = "+body).execute()
            items = results.get('files', [])
            if not items:
                print('No files found.')
                sys.exit()
            else:
                print('Files:')
                for item in items:
                    print(u'{0} ({1})'.format(item['name'], item['id']))
                    file_id = item['id']
    
        user_permission = {
            'type': 'user',
            'role': 'writer',
            'emailAddress': 'your_client_email'
        }
        drive_service.permissions().create(body=user_permission, fileId=file_id).execute()
    
    1. And then we just have to call the functions
    get_emails(mail_ids)
    get_filename(mail_ids, file_name)
    process_name(file_name, name_string)
    give_permissions(name_string)
    

    There is no way to trigger this script for each new email received, but you can trigger it with a timer or something like that and it will search for new emails.

    (1) The drive.file scope only works with certain files, according to the last update of the documentation