How do Cloud Run instances perceive multiple requests from HTTP2 connections?


HTTP2 has this multiplexing feature.

From this answer we get that:

Put simply, multiplexing allows your Browser to fire off multiple requests at once on the same connection and receive the requests back in any order.

Let's say I split my app into 50 small bundled files, to take advantage of the multiplex communication.

My server is an express app hosted in a Cloud Run instance.

Here is what Cloud Run says about concurrency:

By default Cloud Run container instances can receive many requests at the same time (up to a maximum of 250).

So, if 5 users hit my app at the same time, does it mean that my instance will be maxed out for a brief moment?

Because each browser (from the 5 users) will make 50 requests (for the 50 small bundled files), resulting in a total of 250.

Does the fact that multiplexed traffic occurs over the same connection change anything? How does it work?

Does it mean that my Cloud Run instance will perceive 5 connections and my Express server will perceive 250 requests? I think I'm confused about what "request" means from these 2 perspectives (the Cloud Run instance and the Express server).


Solution

  • A "request" is:

    • The establishment of the connection between the server and the client (the browser here)
    • The data transfer
    • The connection close.

    With the streaming capability of HTTP2 and WebSockets, the connection can last minutes (up to 1 hour), and you can send data through the channel as you want. 1 connection = 1 request, 5 connections = 5 requests.

    But keep in mind that keeping this connection open and processing data in it consumes resources on your backend. You can't have dozens of connections actively sending and receiving data, or you will saturate your instance.
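
To check what the Express server itself perceives, one option is to count requests in flight and the distinct sockets they arrive on. The following is a minimal sketch, assuming an Express-style `(req, res, next)` middleware; `createInFlightTracker` and the field names it returns are hypothetical, not part of Express or Cloud Run. The sockets counted are the ones that reach the container, which are not necessarily the browsers' own connections.

```javascript
// Hypothetical helper (illustrative names): tracks how many requests are
// currently being handled and how many distinct sockets carried them.
function createInFlightTracker() {
  let inFlight = 0; // requests started but not yet closed
  let peak = 0;     // highest value inFlight has reached
  const sockets = new Set(); // sockets seen and still open

  function middleware(req, res, next) {
    inFlight += 1;
    peak = Math.max(peak, inFlight);

    const socket = req.socket;
    if (socket && !sockets.has(socket)) {
      sockets.add(socket);
      // Forget the socket once the connection itself is closed.
      socket.once('close', () => sockets.delete(socket));
    }

    // 'close' is emitted when the response completes or the
    // underlying connection is terminated early.
    res.once('close', () => {
      inFlight -= 1;
    });

    next();
  }

  return {
    middleware,
    stats: () => ({ inFlight, peak, connections: sockets.size }),
  };
}
```

In an Express app this could be registered with `app.use(tracker.middleware)` ahead of the routes that serve the bundled files, with `tracker.stats()` logged while the page loads. Comparing `peak` with `connections` shows whether the server saw many requests over few sockets or roughly one request per socket.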