Search code examples
google-cloud-run

Load Balancer + Cloud Run + Cloud SQL | Connection reset by peer


I am currently hosting a FastAPI (Python) app on Cloud Run. I also have a Cloud SQL DB to store some data. Both Cloud Run and Cloud SQL are on the same VPC network. I am using this code (connect_tcp_socket) to connect my Cloud Run instance to my Cloud SQL DB.

Everything was working fine until I decided to add a load balancer in front of my Cloud Run instance. All of sudden, querying endpoints through my load balancer that were communicating with my Cloud SQL DB would automatically and instantly get a ECONNRESET Connection reset by peer.

Querying the same endpoints through my direct Cloud Run URL still works perfectly fine.

So my conclusion is it has something to do with the addition of the load balancer. I read online that it might be due some timeouts differences between load balancer/cloud run/cloud sql, but unfortunately I am not the best at networking so I am coming here to see if anyone has any ideas on how to solve this issue.

Thank you for your help

Reading this Cloud Run documentation https://cloud.google.com/run/docs/troubleshooting#connection-reset-by-peer, I think my issue is:

If you're using an HTTP proxy to route your Cloud Run services or jobs egress traffic and the proxy enforces maximum connection duration, the proxy might silently drop long-running TCP connections such as the ones established using connection pooling. This causes HTTP clients to fail when reusing an already closed connection. If you intend to route egress traffic through an HTTP proxy, make sure you account for this scenario by implementing connection validation, retries and exponential backoff. For connection pools, configure maximum values for connection age, idle connections, and connection idle timeout.

The only logs I have in Cloud Run is when I request the endpoint I get my usual redirect, and then the client immediately get connection reset: "GET /v1/users HTTP/1.1" 307 Temporary Redirect


Solution

  • The issue was more silly than expected: /v1/users: Gets connection rest /v1/users/: Works This seems to be due that FastAPI does a temporary redirect to /v1/users/, which in return the load balancer doesnt seem to like