Tags: python, google-cloud-platform, google-cloud-functions, python-asyncio, google-cloud-scheduler

Cloud Scheduler invokes Cloud Function more than once during schedule


I have a Cloud Function that runs some asynchronous code. It makes a GET request to an endpoint to retrieve some data and then stores that data in Cloud Storage. I have set the Cloud Function up to be triggered by Cloud Scheduler via HTTP. When I use the built-in test option in Cloud Functions, everything works fine, but when Cloud Scheduler invokes the function, it gets invoked more than once. I could tell from the logs, which show multiple execution IDs and repeated output from the print statements I have in place. Does anyone know why Cloud Scheduler is invoking the function more than once? I have Max Retry Attempts set to 0. In one part of my code I use asyncio's create_task and sleep so that the tasks are put onto the event loop gradually, which slows down the rate of requests. Could this be causing Cloud Scheduler to do something unexpected?

async with aiohttp.ClientSession(headers=headers) as session:
    tasks = []

    # Schedule one task per page, pausing between them to throttle the request rate.
    for i in range(1, total_pages + 1):
        tasks.append(asyncio.create_task(self.get_tasks(session=session, page=i)))
        await asyncio.sleep(delay_per_request)
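
For anyone who wants to try the throttling pattern from the snippet above in isolation, here is a minimal, self-contained sketch. It assumes nothing about the real code: `fake_fetch` is a hypothetical stand-in for `self.get_tasks`, no aiohttp session is involved, and the `asyncio.gather` call at the end is my addition, since the snippet does not show how the tasks are awaited. It only illustrates the create_task-plus-sleep staggering, not the cause of the duplicate invocations.

```python
import asyncio


async def fake_fetch(page: int) -> dict:
    """Hypothetical stand-in for self.get_tasks(); simulates one page request."""
    await asyncio.sleep(0.01)  # pretend network latency
    return {"page": page}


async def fetch_all_pages(total_pages: int, delay_per_request: float) -> list:
    tasks = []
    for i in range(1, total_pages + 1):
        # create_task schedules the coroutine on the event loop;
        # the sleep that follows spaces out when each request starts.
        tasks.append(asyncio.create_task(fake_fetch(i)))
        await asyncio.sleep(delay_per_request)
    # Assumption: wait for every task before returning, so that no work
    # is still pending when the function sends its HTTP response.
    return await asyncio.gather(*tasks)


def run(total_pages: int = 3, delay_per_request: float = 0.005) -> list:
    """Synchronous wrapper, e.g. for calling from a non-async entry point."""
    return asyncio.run(fetch_all_pages(total_pages, delay_per_request))
```

`asyncio.gather` returns results in the order the tasks were passed in, so the pages come back in page order even if individual requests finish out of order.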

Solution

  • For my particular case, when testing natively (using the Cloud Function's built-in test option), my Cloud Function performed as expected. However, when I set up Cloud Scheduler to trigger the Cloud Function via HTTP, it unexpectedly ran more than once. As @EdoAkse mentioned in the original thread, my Cloud Scheduler event was firing more than once. My solution was to set up a Pub/Sub topic that triggers the Cloud Function, and to have Cloud Scheduler publish to that topic instead of calling the function over HTTP. This is essentially how Google describes it in their docs.

    • Cloud Scheduler -> Pub/Sub Trigger -> Cloud Function
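
A rough sketch of what the function side of that setup could look like, assuming a 1st gen background function with the `(event, context)` signature, where the Pub/Sub message body arrives base64-encoded in `event["data"]`. The names `decode_message` and `scheduled_entry_point` are hypothetical, and the actual fetch-and-store logic from the question is left as a comment.

```python
import base64


def decode_message(event: dict) -> str:
    """Return the Pub/Sub message payload as text ('' if there is none)."""
    data = event.get("data")
    return base64.b64decode(data).decode("utf-8") if data else ""


def scheduled_entry_point(event: dict, context) -> None:
    """Hypothetical entry point for a Pub/Sub-triggered Cloud Function."""
    payload = decode_message(event)
    print(f"Triggered by scheduler message: {payload!r}")
    # The existing async fetch-and-store logic would be started here,
    # for example with asyncio.run(...) on the function's main coroutine.
```

With this arrangement the Cloud Scheduler job targets the Pub/Sub topic rather than the function's HTTP URL, and the message body configured on the job is what shows up in `payload`.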