I need some help as I'm smashing my head on a wall.
I need to write a script to run periodically on lambda that will pull values from some sheets in google drive. The most straightforward way of finding these is to use the gdrive labels feature. We've enabled it, created the label, and tagged some files.
I can then use the api explorer to query for all files with that label using this query
'labels/LYBX-my-label-id-bFcb' in labels
I can also grab what my browser sent out and run it locally in postman or node/whatever. It works and returns the expected file listings.
However, that is using my personal account credentials, and when doing this "for real" we need to use a service account of course. So we created a GCP project with a service account, and I'm using the googleapiclient python package. I store the secret for that service account in AWS Secrets Manager, fetch it, and configure my instance of the drive resource with it.
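For concreteness, the setup described above could be sketched roughly as follows. This is an illustration, not the actual script: the function names are hypothetical, and it assumes the secret stores the complete service account JSON key and that the drive.readonly scope is enough for read-only listing. The optional subject argument is left as None here, which matches a plain service account with no user impersonation.

```python
import json

# Assumption: read-only access is sufficient for listing files.
DRIVE_SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def parse_key(secret_string):
    """Decode the stored secret and check it looks like a service account key."""
    info = json.loads(secret_string)
    required = ("client_email", "private_key", "token_uri")
    missing = [name for name in required if name not in info]
    if missing:
        raise ValueError(f"secret is missing service account fields: {missing}")
    return info


def fetch_key(secret_id):
    """Fetch the key JSON from AWS Secrets Manager (secret_id is hypothetical)."""
    # Imported lazily so parse_key can be used without the AWS SDK installed.
    import boto3

    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_id)
    return parse_key(response["SecretString"])


def build_drive(info, subject=None):
    """Build a Drive v3 resource from key material held in memory."""
    # Imported lazily for the same reason as boto3 above.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_info(
        info, scopes=DRIVE_SCOPES, subject=subject
    )
    return build("drive", "v3", credentials=credentials)
```

In a Lambda handler this would be used along the lines of drive = build_drive(fetch_key("my-drive-sa-key")), where the secret name is a placeholder.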
This all works. I can use it to call drive.files().get(...) and drive.files().list(...) and fetch data on files using all sorts of queries except the one I use above for the label. When I do that query I get back a 400 error that complains about the q (query) parameter.
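When chasing a 400 like this, it can help to print the whole error body rather than only the exception message. The sketch below assumes the response uses the usual Google API JSON error layout (an "error" object with "message" and an "errors" list); the helper names are hypothetical and the exact fields returned for this query may differ.

```python
import json


def describe_http_error(status, content):
    """Turn a Google API error response body into a one-line summary."""
    try:
        body = json.loads(content)
    except ValueError:
        # Not JSON at all: fall back to the raw body.
        return f"HTTP {status}: {content!r}"
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        return f"HTTP {status}: {body!r}"
    parts = [f"HTTP {status}: {error.get('message', '')}"]
    for item in error.get("errors", []):
        parts.append(f"reason={item.get('reason')} location={item.get('location')}")
    return " | ".join(parts)


def list_with_diagnostics(drive, query):
    """Run files().list and print the detailed error body if it fails."""
    # Imported lazily so describe_http_error works without the client library.
    from googleapiclient.errors import HttpError

    try:
        return drive.files().list(q=query).execute()
    except HttpError as err:
        # err.resp.status is the HTTP status; err.content is the raw body (bytes).
        print(describe_http_error(err.resp.status, err.content))
        raise
```

Comparing this output between the personal token and the service account shows whether the two responses differ in anything beyond the status code.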
Now I've dropped down to the level of the url itself, and the exact GET request url that my python script logs works when I use my personal bearer token. I'm therefore pretty sure that this is not in fact a bad parameter issue, and that it's instead just a case of google being godawful at api design and returning crappy error codes.
So I'm thinking that this has to be a permission issue, but I have no clue what permissions are required to allow an account to search by gdrive labels nor how I would go about granting those permissions to a service account.
Another possible clue is that drive.files().listLabels(fileId="...") on a file that I know has labels seems to fail, so again it all points to some sort of permission being missing, but it's unclear which one, or how to set it up on a service account.
Note: Since I do not have visibility of your actual script, you can consider this answer as a starting point or reference for fixing the issue in your project. Hopefully, this will resolve your problem.
I ran my own replication and was able to list files with a label-ID query from a service account by using user impersonation. This is set up when the credentials are created: pass a subject parameter so that the service account impersonates a user (such as a super admin account or any domain account with the necessary role). Impersonation relies on domain-wide delegation being granted to the service account for the requested scopes.
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Path to the service account JSON key file
KEY_FILE = 'sa.json'

# Create credentials from the service account key file.
# subject is the user the service account impersonates.
credentials = service_account.Credentials.from_service_account_file(
    KEY_FILE,
    scopes=[
        'https://www.googleapis.com/auth/drive',
        'https://www.googleapis.com/auth/drive.file',
        'https://www.googleapis.com/auth/drive.metadata',
        'https://www.googleapis.com/auth/drive.metadata.readonly',
        'https://www.googleapis.com/auth/drive.readonly',
    ],
    subject="irv@■■■■■■■■■■■■■■.■■■■",
)

# Build the service object
service = build('drive', 'v3', credentials=credentials)

# List files under a label
label_id = "OTVglmjg5BxgxSevMiuLtr6VoaeDwyg66AIRNNEbbFcb"
results = service.files().list(q=f"'labels/{label_id}' in labels").execute()
print(results)
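Note that files().list returns results one page at a time, so a periodic job that needs every labelled file has to follow nextPageToken. A minimal sketch of that loop is below; the function names and the choice of returned fields (id and name) are my own assumptions, and service is a Drive v3 resource built as above.

```python
def label_query(label_id):
    """Build the search query for files carrying the given label."""
    return f"'labels/{label_id}' in labels"


def iter_labelled_files(service, label_id, page_size=100):
    """Yield every file that has the label, following pagination."""
    page_token = None
    while True:
        response = service.files().list(
            q=label_query(label_id),
            pageSize=page_size,
            pageToken=page_token,  # None on the first request
            fields="nextPageToken, files(id, name)",
        ).execute()
        yield from response.get("files", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            break
```

For example, for f in iter_labelled_files(service, label_id): print(f["id"], f["name"]) prints one line per labelled file.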
To test this, I created a label, applied it to two files in my Drive, and ran the script above, which listed the labelled files.