
Pygsheets: cannot upload more than 200-ish files to Drive with Google Service Account. Get 403 Error


I'm using pygsheets to work with Google Spreadsheets and Drive. I need to upload more than 10,000 files to the company Drive, but each time I use a service account, I get an error after uploading roughly 200 files. The error is as follows:

 {
 "error": {
  "errors": [
   {
    "domain": "usageLimits",
    "reason": "dailyLimitExceededUnreg",
    "message": "Daily Limit for Unauthenticated Use Exceeded. Continued use requires signup.",
    "extendedHelp": "https://code.google.com/apis/console"
   }
  ],
  "code": 403,
  "message": "Daily Limit for Unauthenticated Use Exceeded. Continued use requires signup."
 }
}

I checked my Cloud console: I'm not exceeding the request quotas for any time window, and I wait between requests.
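(Not the eventual fix, but worth noting: for transient 403/429 quota errors, the usual mitigation is retrying with exponential backoff rather than a fixed sleep. A minimal sketch — the wrapper name and the broad `Exception` catch are illustrative; real code would catch the client library's specific HTTP error:)

```python
import time

def with_backoff(request, max_retries=5, base_delay=1.0):
    """Call the zero-argument callable `request`, retrying on failure
    with exponentially growing delays (1s, 2s, 4s, ...)."""
    for attempt in range(max_retries):
        try:
            return request()
        except Exception:  # in real code, catch the API's HttpError/403
            if attempt == max_retries - 1:
                raise  # out of retries: let the error propagate
            time.sleep(base_delay * (2 ** attempt))
```

Usage would be e.g. `with_backoff(lambda: upload_cluster(gsheets, cluster))`.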

Attaching relevant code below:

import re
from time import sleep

import pandas
import pygsheets

def upload_cluster(sheets, cluster_array):
    max_global = 0
    max_local = 0
    max_traffic = 0
    sum_global = 0
    sum_local = 0
    sum_traffic = 0
    keywords = len(cluster_array)
    cluster_name = ''
    for elem in cluster_array:
        if elem.get('Global volume') >= max_global:
            cluster_name = elem.get('Keyword')
            max_global = elem.get('Global volume')
        if elem.get('Volume') > max_local:
            max_local = elem.get('Volume')
        if elem.get('Traffic potential') > max_traffic:
            max_traffic = elem.get('Traffic potential')
        sum_global += elem.get('Global volume')
        sum_local += elem.get('Volume')
        sum_traffic += elem.get('Traffic potential')
    book = sheets.create(title=re.sub('\"', '', cluster_name), folder='FOLDER_ID')
    link = f'HYPERLINK(\"https://docs.google.com/spreadsheets/d/{book.id}\",\"{cluster_name}\")'
    dataframe = pandas.DataFrame(cluster_array)
    out_sheet = book.worksheet(property='index', value=0)
    out_sheet.set_dataframe(df=dataframe, start='A1', extend=True, copy_head=True, copy_index=False)
    cluster_summary = {
        'Cluster': link,
        'Volume': sum_local,
        'Global volume': sum_global,
        'Traffic potential': sum_traffic,
        'Max volume': max_local,
        'Max global volume': max_global,
        'Max traffic potential': max_traffic,
        'Queries': keywords
    }
    return cluster_summary

def main():
    gsheets = pygsheets.authorize(service_account_file='service-account.json')
    for i in range(len(output_keywords) - 1):
        if not output_keywords[i].get('Clustered'):
            cluster = []
            cluster.append(output_keywords[i])
            cluster_max = get_vol(output_keywords[i].get('Global volume'))
            cluster_urls = output_keywords[i].get('URLs')
            output_keywords[i]['Clustered'] = True
            print(f'Added to cluster: {cluster[-1]}')
            for j in range(len(output_keywords)):
                if not output_keywords[j].get('Clustered'):
                    if len(set(cluster_urls) & set(output_keywords[j].get('URLs'))) >= 4:
                        cluster.append(output_keywords[j])
                        output_keywords[j]['Clustered'] = True
                        print(f'Added to cluster: {cluster[-1]}')
            print('Uploading cluster...')
            clusters.append(upload_cluster(gsheets, cluster))
            sleep(5)
            print(f'Uploaded: {clusters[-1]}')
            cluster = []

I've tried authorizing via a client secret too, and that seems to work fine, but unfortunately I cannot see the uploaded files in the folder.

Service account HAS access to the Drive folders.


Solution

  • After some more testing I noticed that the script chokes on one specific file because that file has ' in its name. After using a regex to remove ' from the file name, it didn't choke. Weird, but okay. Does anyone have any thoughts on why this might have happened? I thought Google Drive doesn't have forbidden characters in file names.
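    For reference, a minimal sketch of the sanitizing step described above — stripping apostrophes (alongside the double quotes the original code already removed) before passing the title to `sheets.create()`; the helper name is mine:

    ```python
    import re

    def sanitize_title(title: str) -> str:
        # Remove apostrophes and double quotes that made the upload choke
        return re.sub(r"['\"]", '', title)

    # e.g. sanitize_title("Joe's \"best\" keywords") -> 'Joes best keywords'
    # then: book = sheets.create(title=sanitize_title(cluster_name), folder='FOLDER_ID')
    ```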