kubernetes google-cloud-platform websocket load-balancing haproxy

What are the options of routing HTTP connections to one specific instance out of many instances behind a load balancer?

Assume there is a system that accepts millions of simultaneous WebSocket connections from client applications. I was wondering if there is a way to route WebSocket connections to a specific instance behind a load balancer (or IP/Domain/etc) if clients provide some form of metadata, such as hash key, instance name, etc.

For instance, let's say each WebSocket client of the above system will always belong to a group (e.g. max group size of 100), and it will attempt to communicate with 99 other clients using the above system as a message gateway.

So the system's responsibility is to relay messages sent from clients in a group to other 99 clients in the same group. Clients won't ever need to communicate with other clients who belong to different groups.

Of course, one way to tackle this problem is to use Pubsub system, such that regardless of which instance clients are connected to, the server can simply publish the message to the Pubsub system with a group identifier and other clients can subscribe to the messages with a group identifier.

However, the Pubsub system can potentially encounter scaling challenges, excessive resource usage (single message getting published to thousands of instances), management overhead, latency increase, cost, and etc.

If it is possible to guarantee that WebSocket clients in a group will all be connected to the instance behind LB, we can skip using the Pubsub system and make things simpler, lower latency, and etc.

Would this be something that is possible to do, and if it isn't, what would be the best option?

(I am using Kubernetes in one of the cloud service providers if that matters.)

Solution

Routing in HTTP is generally based on the hostname and/or URL path. Sometimes to a lesser degree on other headers like cookies. But in this case it would mean that each group should have it's own unique URL.

But that part is easy, what I think you're really asking is "given arbitrary URLs, how can I get consistent routing?" which is much, much more complicated. The base concept is "consistent hashing", you hash the URL and use that to pick which endpoint to talk to. But then how to do deal with adding or removing replicas without scrambling the mapping entirely. That usually means using a hash ring and assigning portions of the hash space to specific replicas. Unfortunately this is the point where off-the-shelf tools aren't enough. These kinds of systems require deep knowledge of your protocol and system specifics so you'll probably need to rig this up yourself.