I am trying to retrieve all files in Google Drive, but only those in 'My Drive'. I tried including "'me' in owners" in the query, but that gives me tons of files in shared folders where I am the owner. I tried "'root' in parents" in the query, but that returns only files directly under My Drive, while I also need the files in subfolders, in subfolders of those subfolders, and so on.
I also tried setting the drive parameters (corpora and driveId), but in that case the query does not retrieve anything at all:
    driveid = service.files().get(fileId='root').execute()['id']
    page_token = None
    my_files = list()
    while True:
        results = service.files().list(q="'myemail@gmail.com' in owners",
                                       pageSize=10,
                                       orderBy='modifiedTime',
                                       pageToken=page_token,
                                       spaces='drive',
                                       corpora='drive',
                                       driveId=driveid,
                                       includeItemsFromAllDrives=True,
                                       supportsAllDrives=True,
                                       fields="nextPageToken, files(id, name)").execute()
        items = results.get('files', [])
        my_files.extend(items)
        page_token = results.get('nextPageToken', None)
        if page_token is None:
            break
    print(len(my_files))
    # This prints: 0
How can I get this to work?
I guess the other possibility would be to start from root, get its children, and recursively navigate the full tree, but that is going to be very slow. The same applies if I get all the files and then look up all their parents to check whether they are in My Drive or not: I have too many files, and that takes hours.
Thanks in advance!
The first request you make should be for files whose parent is root. This is the top level of your Drive account.
    results = service.files().list(q="'root' in parents").execute()
Now you will need to loop through the results in your code. Check for the MIME type of a directory, 'application/vnd.google-apps.folder'.
Everything that is not a directory should be a file sitting in the root directory of your Google Drive account.
For each of the directories you found, make a new request to find the files in that directory. Note that the ID has to be wrapped in single quotes inside the query:
    results = service.files().list(q="'directoryIdFromLastRequest' in parents").execute()
You can then loop through them, getting all of the files in each of the directories.
You can also set sharedWithMe = false in the q parameter, which should remove all of the files that have been shared with you, so that only the files that are actually yours are returned.
This used to work, but I am currently having issues with it while testing. It looks like it's a known bug: Drive.Files.list query throws error when using "sharedWithMe = false".
The thing is, as mentioned, files.list will by default just return everything, in no particular order, so technically you could just call files.list with the sharedWithMe filter and get back all the files and directories on your Drive account. By requesting a pageSize of 1000 you will need fewer requests. Then sort it all locally on your machine once it has been downloaded.
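A rough sketch of that "download everything, sort locally" idea might look like the following. Since the sharedWithMe filter is reported as unreliable, this version falls back on the "'me' in owners" term from the question and additionally asks for the parents field, so the tree can be rebuilt in memory without extra requests. The function names fetch_owned_files and files_under_root are made up for illustration, service is assumed to be an authorized Drive v3 client as in the question, and the filter assumes each item has a single parent (only the first one is followed).

```python
def fetch_owned_files(service):
    """Download metadata for every non-trashed file owned by the current user."""
    files = []
    page_token = None
    while True:
        response = service.files().list(
            q="'me' in owners and trashed = false",
            pageSize=1000,  # upper limit; the API may still return fewer per page
            pageToken=page_token,
            fields="nextPageToken, files(id, name, mimeType, parents)",
        ).execute()
        files.extend(response.get("files", []))
        page_token = response.get("nextPageToken")
        if page_token is None:
            return files


def files_under_root(files, root_id):
    """Keep only the entries whose chain of parents ends at root_id."""
    # Assumption: only the first parent of each item is followed.
    parent_of = {f["id"]: (f.get("parents") or [None])[0] for f in files}
    verdict = {root_id: True}  # memo: item id -> does it sit under root?

    def reaches_root(file_id):
        trail = []
        current = file_id
        while current not in verdict:
            # Unknown parent, missing parent, or a loop: not under My Drive.
            if current is None or current not in parent_of or current in trail:
                verdict[current] = False
                break
            trail.append(current)
            current = parent_of[current]
        result = verdict[current]
        for visited in trail:
            verdict[visited] = result
        return result

    return [f for f in files if reaches_root(f["id"])]


# Possible usage, given an authorized `service`:
#   root_id = service.files().get(fileId="root", fields="id").execute()["id"]
#   my_drive_items = files_under_root(fetch_owned_files(service), root_id)
```

The filtering step runs entirely on the downloaded list, so the number of API calls depends only on how many pages files.list returns, not on how deep the folder tree is. Items you own that live inside a folder owned by someone else drop out, because that folder never appears in the listing.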
The other option would be to do as I have written above and grab each directory in turn. This will probably result in more requests.
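For comparison, the directory-by-directory approach could be sketched roughly as below. It keeps a queue of folder IDs, lists the children of each one, and collects everything that is not a folder. The names list_children and walk_my_drive are hypothetical, and service is again assumed to be an authorized Drive v3 client; it costs at least one request per folder.

```python
from collections import deque

FOLDER_MIME = "application/vnd.google-apps.folder"


def list_children(service, folder_id):
    """Yield every non-trashed item directly inside folder_id, page by page."""
    page_token = None
    while True:
        response = service.files().list(
            q=f"'{folder_id}' in parents and trashed = false",
            pageSize=1000,
            pageToken=page_token,
            fields="nextPageToken, files(id, name, mimeType)",
        ).execute()
        yield from response.get("files", [])
        page_token = response.get("nextPageToken")
        if page_token is None:
            break


def walk_my_drive(service, root_id="root"):
    """Return all non-folder items reachable from root_id, breadth first."""
    files = []
    seen = {root_id}            # guards against visiting a folder twice
    pending = deque([root_id])  # folders whose children are still to be listed
    while pending:
        folder_id = pending.popleft()
        for item in list_children(service, folder_id):
            if item["mimeType"] == FOLDER_MIME:
                if item["id"] not in seen:
                    seen.add(item["id"])
                    pending.append(item["id"])
            else:
                files.append(item)
    return files
```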