I've deployed a FastAPI app on Google Cloud Run. Some of the API endpoints hit external websites and download data. Calls to these endpoints are slow, and I've measured the time it takes to get the response back from the external sites at 4-5 seconds. The data I'm getting back is around 3-7 MB, not a ton. When I test locally it takes around 0.7 seconds. I also created a Compute Engine instance with the same characteristics, ran a container from the same image, and tested this endpoint; there it takes around 1 second.
My Cloud Run settings are mostly default; CPU should be 2 cores and memory 8 GB. I know cold starts are a problem, but when I submit a request immediately after a previous one, it still takes 4-5 seconds. So regardless of cold starts, the outbound request from the app to the site is the problem. Looking at the CPU and memory metrics, nothing stands out; neither goes above 25%.
I'm serving the app via uvicorn with this line in my Dockerfile:
CMD uvicorn api:app --port $PORT --host 0.0.0.0
I tried a couple of different approaches here too, but because I tested serving it the same way on the Compute Engine instance, I don't think this is the problem.
The only thing I can think of is that some setting is limiting my bandwidth, but I can't find anything related to this. As far as I'm aware I should have a default of 1 Gbps, while 3-7 MB in 4-5 seconds works out to only around 1 MB/s (on the order of 10 Mbps). Any ideas?
Network performance is improved in the second-generation execution environment. Try explicitly selecting it in your Cloud Run service configuration.
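As a rough sketch, one way to switch an existing service to the second-generation environment is with the gcloud CLI. The service name my-fastapi-service and the region us-central1 are placeholders I made up, so substitute your own; I'm also assuming you deploy with gcloud rather than through the console or a YAML file.

```shell
# Hypothetical placeholders: replace with your actual service name and region.
SERVICE="my-fastapi-service"
REGION="us-central1"

# Redeploy the existing service on the second-generation execution environment.
gcloud run services update "$SERVICE" \
  --region "$REGION" \
  --execution-environment gen2

# Dump the service config and look for the execution-environment setting
# to confirm the new revision picked it up.
gcloud run services describe "$SERVICE" \
  --region "$REGION" \
  --format yaml | grep -i "execution-environment"
```

After the new revision is serving traffic, repeat the same timing measurement on the endpoint and compare it with the 4-5 seconds you saw before.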