firebase firebase-hosting google-cloud-run google-cloud-scheduler

Cloud Scheduler job hits HTTPS endpoint and is logged as failed (status 502), but the server returns a success response (status 200)

I have a Cloud Scheduler job that is supposed to hit my API every hour to update some prices. The job takes about 80 seconds to run.

Here is what it does:

POST https://www.example.com/api/jobs/update-prices

My app is hosted on Firebase Hosting, and /api/jobs/** is handled by a Cloud Run service named server-jobs.

firebase.json

{
  "source": "/api/jobs/**",
  "run": { "serviceId": "server-jobs", "region": "europe-west1" }
},

Almost everything is working fine. The logs from my server-jobs service show the Google-Cloud-Scheduler user agent, and all the responses are 200 OK.

But in the Cloud Scheduler console, the job appears to have failed. The Cloud Scheduler logs show the following error:

Status 502 "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"

But that 502 is not coming from my server at all.

MORE DETAILS

I've got 2 different custom domains connected to that Firebase Hosting site, and the error only happens for one of them. I have two Cloud Scheduler jobs (configured exactly the same), one for each of the connected domains.

JOB 1 https://www.example1.com/api/jobs/update-prices SUCCEEDS (job takes ~35 seconds)
JOB 2 https://www.example2.com/api/jobs/update-prices FAILS (job takes ~80 seconds)

I don't think the job duration has anything to do with it, because at first I was running that exact same job against the Cloud Run URL directly (without passing through the Firebase Hosting proxy), and it was logged as successful. This error only began once I started hitting the connected Firebase Hosting domains instead of the Cloud Run service URL.

Those domains are registered in different places and different countries. But both are properly configured and are working fine in production.

And every log on my server is successful 200.

UPDATE

I've just noticed a pattern:

It seems that the error is triggered exactly 1 minute (60 seconds) after the job starts. So the duration of the job probably has something to do with it after all: the job that takes 80 seconds is "failing", and the one that takes 35 seconds is succeeding. Is this a timeout issue? What is returning the 502? My server is working as expected, as you can see from the logs.

UPDATE 2

I've just checked the jobs with gcloud scheduler jobs describe JOB_NAME and both of them are configured with attemptDeadline: 180s. So my job durations should not be a problem on the Cloud Scheduler side, since both of them are under 180 seconds.

UPDATE 3

As I had suspected, the job duration seems to be the issue here.

I've done the following tests on my API handler function:

Wait 59 seconds and res.sendStatus(200)
RESULT: Cloud Scheduler shows both jobs as SUCCESS

And also:

Wait 65 seconds and res.sendStatus(200)
RESULT: Cloud Scheduler shows both jobs as FAILED
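
For reference, the test handler looked roughly like the sketch below. This is a reconstruction, not my exact code: makeDelayedHandler is a name made up for illustration, and the route wiring in the trailing comment assumes an Express app that is not shown here.

```javascript
// Hypothetical helper: builds a handler that waits `delayMs` milliseconds
// and then replies with 200, mimicking a long-running job.
function makeDelayedHandler(delayMs) {
  return async function handler(req, res) {
    // Hold the request open for the requested duration.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    // Express-style response helper, as used in the tests above.
    res.sendStatus(200);
  };
}

// Assumed wiring in an Express app (app is not defined in this sketch):
// app.post("/api/jobs/update-prices", makeDelayedHandler(65 * 1000));
```

Swapping the delay between 59 and 65 seconds is enough to flip the result that Cloud Scheduler reports.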

It seems that there is a 60-second threshold somewhere along the chain:

Cloud Scheduler
Firebase Hosting
Cloud Run

Solution

I guess Firebase Hosting is the culprit here.

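Firebase Hosting enforces a 60-second request timeout. That's why Cloud Scheduler sees an error even though the job succeeds on the server. (In my case Cloud Scheduler logged a 502 rather than the 504 mentioned in the note below, but the 60-second cutoff matches my tests.)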

From: https://firebase.google.com/docs/hosting/functions

Note: Firebase Hosting is subject to a 60-second request timeout. Even if you configure your HTTPS function with a longer request timeout, you'll still receive an HTTPS status code 504 (request timeout) if your function requires more than 60 seconds to run. To support dynamic content that requires longer compute time, consider using an App Engine flexible environment.
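
To make the reasoning concrete: the request only succeeds if it finishes before the shortest timeout of any hop in the chain. Here is a minimal sketch of that check, using only the two values known from this post (the 180s attemptDeadline and the 60-second Firebase Hosting limit). The function names are made up for illustration, and the Cloud Run timeout is left out because I never looked it up; it can be appended to the list.

```javascript
// Timeouts (in seconds) for the hops between the scheduler and the handler.
// Only values stated in this post are listed; add the Cloud Run request
// timeout here if you know it.
const hops = [
  { name: "Cloud Scheduler attemptDeadline", timeoutSec: 180 },
  { name: "Firebase Hosting request timeout", timeoutSec: 60 },
];

// Returns the hop with the smallest timeout, i.e. the effective limit.
function tightestHop(hops) {
  return hops.reduce((min, hop) => (hop.timeoutSec < min.timeoutSec ? hop : min));
}

// True if a job of the given duration would exceed the effective limit.
function wouldTimeOut(jobDurationSec, hops) {
  return jobDurationSec > tightestHop(hops).timeoutSec;
}

console.log(tightestHop(hops).name); // Firebase Hosting request timeout
console.log(wouldTimeOut(35, hops)); // false (JOB 1)
console.log(wouldTimeOut(80, hops)); // true  (JOB 2)
```

This matches what I observed: the 35-second job passes, the 80-second job is reported as failed, and raising attemptDeadline would not help because it is not the tightest limit.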