google-cloud-platform grpc google-cloud-run

GCP Cloud Run and gRPC streaming costs

I'm deploying a gRPC service to Cloud Run. The service establishes a bidirectional stream with the client, and that stream should be long-lived. I see that unless I choose to keep CPU always allocated for my instances, Cloud Run's pricing model is based on requests, and part of that calculation seems to be the execution time per request. I've looked into this, but it's not clear to me whether a long-lived gRPC stream counts as a running execution. That is, am I charged for a long-lived gRPC stream for the entire duration of that stream? Or is the stream transparent to GCP, so that I'm only charged per message sent inside the stream?

Solution

Just think about how Google's pricing model works: when you use a resource, you pay for it.

With that in mind, you can deduce the Cloud Run pricing: while you handle a request, you use CPU to process it; outside of requests, the CPU is throttled (so that it can be used by other workloads).

Now, think about a streaming connection: you must keep a tunnel and a communication channel open to be able to stream in both directions. Therefore, the CPU must stay allocated to handle it, and the stream counts as one in-flight request for as long as it is open. As a result, you pay the full cost for as long as your streaming channel is open, not per message sent inside it.
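To make the reasoning concrete, here is a minimal sketch of how the cost of one open stream scales with its duration rather than with the number of messages. The function name and the two rates are placeholders invented for illustration, not actual Cloud Run prices; substitute the values from the current pricing page for your region.

```python
def stream_cost(duration_s, vcpus, memory_gib, cpu_rate, mem_rate):
    """Estimate the cost of keeping one instance active for one stream.

    The number of messages exchanged does not appear anywhere:
    only the time the stream stays open matters.
    """
    return duration_s * (vcpus * cpu_rate + memory_gib * mem_rate)


# Hypothetical rates (per vCPU-second and per GiB-second), NOT real prices.
CPU_RATE = 0.00002
MEM_RATE = 0.000002

# One stream held open for an hour on a 1 vCPU / 0.5 GiB instance.
one_hour = stream_cost(3600, vcpus=1, memory_gib=0.5,
                       cpu_rate=CPU_RATE, mem_rate=MEM_RATE)
# Same stream held open for a full day.
one_day = stream_cost(24 * 3600, vcpus=1, memory_gib=0.5,
                      cpu_rate=CPU_RATE, mem_rate=MEM_RATE)

print(f"1 hour: {one_hour:.4f}, 1 day: {one_day:.4f}")
```

With these placeholder rates, a stream that stays open 24 times longer costs 24 times more, whether it carries one message or a million.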

Of course, if the same instance can handle multiple streaming channels at the same time, you pay only for the resources of that one instance. However, if you set the concurrency to 1, so that each instance handles only 1 request, every open stream needs its own instance and the cost is multiplied accordingly.
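The effect of the concurrency setting can be sketched as follows. This is a rough estimate that assumes streams are packed perfectly onto instances, which the real autoscaler does not guarantee; the function name and the figure of 100 open streams are made up for the example.

```python
import math


def instances_needed(open_streams, concurrency):
    """Lower bound on active instances, assuming perfect packing of streams."""
    return math.ceil(open_streams / concurrency)


# 100 simultaneously open streams under different concurrency settings.
results = {c: instances_needed(100, c) for c in (1, 10, 80)}

for concurrency, count in results.items():
    print(f"concurrency={concurrency}: at least {count} active instance(s)")
```

Since each active instance is billed for as long as it holds at least one open stream, the bill grows roughly in proportion to the instance count, so concurrency 1 is the most expensive configuration for long-lived streams.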