
Google API/Python: Empty folder in Google Drive before uploading new files


I have this script, written with a lot of help from @Tanaike. It does two things:

  1. Deletes old files in a Google Drive folder: it lists the files in my local folder (CSVtoGD) and deletes the files in Google Drive that have the same names.
  2. Uploads the new CSV files to Google Drive.

I have a problem with deleting the old files in Google Drive. The script uploads new files and deletes the old ones in Google Drive, but if my local folder (CSVtoGD) contains a new file that has never been uploaded to Google Drive, I receive this error:

HttpError: <HttpError 404 when requesting https://www.googleapis.com/upload/drive/v3/files?fields=id&alt=json&uploadType=multipart returned "File not found: 19vrbvaeDqWcxFGwPV82APWYTmBMEn-hi.". Details: "[{'domain': 'global', 'reason': 'notFound', 'message': 'File not found: 19vrbvaeDqWcxFGwPV82APWYTmBMEn-hi.', 'locationType': 'parameter', 'location': 'fileId'}]">

Script:

import gspread
import os
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

gc = gspread.oauth(credentials_filename='/users/user/credentials.json')
service = build("drive", "v3", credentials=gc.auth)
folder_id = '19vrbvaeDqWcxFGwPV82APWYTmBME'  


def getSpreadsheetId(filename, filePath):
    q = "name='" + filename + "' and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false"
    res = service.files().list(q=q, fields="files(id)", corpora="allDrives", includeItemsFromAllDrives=True, supportsAllDrives=True).execute()
    items = res.get("files", [])
    if not items:
        print("No files found.")
        
        file_metadata = {
            "name": filename,
            "parents": [folder_id],
            "mimeType": "application/vnd.google-apps.spreadsheet",
        }
        media = MediaFileUpload(filePath + "/" + filename + ".csv")
        file = service.files().create(body=file_metadata, media_body=media, fields="id").execute()
        id = file.get("id")
        print("File was uploaded. The file ID is " + id)
        exit()

    return items[0]["id"]


filePath = '/users/user/CSVtoGD'
os.chdir(filePath)

files = os.listdir()

for filename in files:
    fname = filename.split(".")
    if fname[1] == "csv":
        oldSpreadsheetId = getSpreadsheetId(fname[0], filePath)
        print(oldSpreadsheetId)
        sh = gc.del_spreadsheet(oldSpreadsheetId)
        sh = gc.create(fname[0], folder_id)
        content = open(filename, "r").read().encode("utf-8")
        gc.import_csv(sh.id, content)

I tried to make it work, but none of my attempts succeeded, so I think I have to change the script to simply delete all files in my Google Drive folder and then upload the new files. Reading the Google API docs, though, I can't see where to start: files have to be deleted by their Google Drive ID, so I would somehow have to list the file IDs and then delete all of them.

In short: how can I delete ALL files in a Google Drive folder and then upload new files from a local folder (CSVtoGD)? The uploading part of the script works well; my problem is deleting the old files.


Solution

  • I believe your goal is as follows.

    • You want to upload the files after all files in the folder have been deleted.
    • Judging from the script you showed, you want to delete the Google Spreadsheet files in the folder.

    In this case, how about the following modification?

    Modified script:

    import gspread
    import os
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload
    
    gc = gspread.oauth(credentials_filename='/users/user/credentials.json')
    service = build("drive", "v3", credentials=gc.auth)
    folder_id = '19vrbvaeDqWcxFGwPV82APWYTmBME'  
    
    
    def deleteAllFilesInFolder():
        q = "'" + folder_id + "' in parents and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false"
        res = service.files().list(q=q, fields="files(id)", corpora="allDrives", includeItemsFromAllDrives=True, supportsAllDrives=True, pageSize=1000).execute()
        for e in res.get("files", []):
            gc.del_spreadsheet(e["id"])
    
    
    def uploadCSV(filename, filePath):
        file_metadata = {
            "name": filename,
            "parents": [folder_id],
            "mimeType": "application/vnd.google-apps.spreadsheet",
        }
        media = MediaFileUpload(filePath + "/" + filename + ".csv")
        file = service.files().create(body=file_metadata, media_body=media, fields="id", supportsAllDrives=True).execute()
        file_id = file.get("id")  # avoid shadowing the built-in id()
        print("File was uploaded. The file ID is " + file_id)
    
    
    filePath = '/users/user/CSVtoGD'
    os.chdir(filePath)
    
    files = os.listdir()
    
    delete = False
    for filename in files:
        # splitext copes with names that have no dot or more than one dot
        name, ext = os.path.splitext(filename)
        if ext == ".csv":
            if not delete:
                deleteAllFilesInFolder()
                delete = True
            uploadCSV(name, filePath)
    
    • When this script is run, all Spreadsheet files in the folder are deleted first. Then the CSV files are uploaded as Google Spreadsheets.
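
The "delete once, then upload each CSV" flow depends on reliably picking out the CSV files in the local folder. As a rough sketch only, the helper below (the name `find_csv_files` is hypothetical and not part of the script above) collects them with `pathlib`, skipping sub-folders and files with other extensions:

```python
from pathlib import Path


def find_csv_files(folder):
    """Return (name_without_extension, full_path) pairs for CSV files in folder."""
    pairs = []
    for path in sorted(Path(folder).iterdir()):
        # Skip sub-folders and anything that does not end in .csv
        if path.is_file() and path.suffix.lower() == ".csv":
            pairs.append((path.stem, str(path)))
    return pairs
```

Each returned name could then be passed to `uploadCSV(name, filePath)`, assuming the files keep a lowercase `.csv` extension on disk, since `uploadCSV` appends `.csv` itself.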

    Note:

    • In this modified script, all Spreadsheet files in the folder are permanently deleted, so please be careful. I recommend testing it first with sample Spreadsheet files.
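
One possible way to test safely is a dry-run switch that only reports what would be removed. The sketch below is an assumption, not part of the answer's script: `delete_files` and its `dry_run` parameter are hypothetical names, and `service` is expected to be a Drive v3 client such as the one created with `build("drive", "v3", ...)`.

```python
def delete_files(service, file_ids, dry_run=True):
    """Delete the given Drive files, or only report them when dry_run is True."""
    deleted = []
    for file_id in file_ids:
        if dry_run:
            print("Would delete " + file_id)
            continue
        # files().delete removes the file permanently, bypassing the trash
        service.files().delete(fileId=file_id, supportsAllDrives=True).execute()
        deleted.append(file_id)
    return deleted
```

Running it once with the default `dry_run=True` and checking the printed IDs before switching to `dry_run=False` keeps the irreversible step explicit.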

    • If you want to delete all files in the folder, not only Spreadsheets, please change q = "'" + folder_id + "' in parents and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false" to q = "'" + folder_id + "' in parents and trashed=false".
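
The two query variants can be produced by one small helper, and the listing can follow `nextPageToken` so that folders whose results span more than one page are fully covered. This is a minimal sketch under assumptions: `build_folder_query` and `list_file_ids` are hypothetical names, and `service` is a Drive v3 client like the one in the script.

```python
SPREADSHEET_MIME = "application/vnd.google-apps.spreadsheet"


def build_folder_query(folder_id, spreadsheets_only=True):
    """Build the Drive search query for the direct children of a folder."""
    q = "'" + folder_id + "' in parents and trashed=false"
    if spreadsheets_only:
        q += " and mimeType='" + SPREADSHEET_MIME + "'"
    return q


def list_file_ids(service, folder_id, spreadsheets_only=True):
    """Collect the IDs of every matching file, following nextPageToken."""
    ids = []
    page_token = None
    while True:
        res = service.files().list(
            q=build_folder_query(folder_id, spreadsheets_only),
            fields="nextPageToken, files(id)",
            corpora="allDrives",
            includeItemsFromAllDrives=True,
            supportsAllDrives=True,
            pageSize=1000,
            pageToken=page_token,
        ).execute()
        ids.extend(f["id"] for f in res.get("files", []))
        page_token = res.get("nextPageToken")
        if not page_token:
            return ids
```

Calling `list_file_ids(service, folder_id, spreadsheets_only=False)` would return the IDs of every non-trashed file in the folder, which could then be deleted one by one.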