Tags: architecture, scalability, topology

Scaling of push events - Best topology?


I've built a TCP server which handles RPC (request/reply) type requests from clients, but it also allows services to push events down at ad-hoc times.

If I need to scale in the future, the RPC side is quite easy: as with web infrastructure, I'll just add more nodes and load-balance across them.

To scale the push messages, I will need all the servers to coordinate, because the client(s) subscribed to an event could be connected to any server.

My options are:

  1. broadcast the events to all the servers using UDP multicast/broadcast (e.g. emcaster)
  2. fully interconnect the servers to each other using TCP
  3. a central server to which all the events are sent, with all the worker servers connected to it
  4. as [3], but with several layers to form a tree

My temptation is to go with [1] as it is simple and probably works well for up to 20-30 nodes. Is there a consensus on what the best strategies are for different ranges of N, where N is the number of nodes?
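For reference, option [1] could look roughly like this in Python. It is a minimal sketch and not emcaster's API: the group address `239.1.2.3`, port `5007`, the JSON wire format and all function names are assumptions made for illustration. Each server would publish its events to the group and also join the group to receive what the other servers publish.

```python
import json
import socket
import struct

# Assumed values for illustration; choose a group and port that suit your network.
GROUP = "239.1.2.3"   # address in the administratively scoped multicast range
PORT = 5007

def encode_event(topic, payload):
    """Serialize one event into the bytes of a single datagram (JSON is an assumption)."""
    return json.dumps({"topic": topic, "payload": payload}).encode("utf-8")

def decode_event(data):
    """Inverse of encode_event; returns (topic, payload)."""
    msg = json.loads(data.decode("utf-8"))
    return msg["topic"], msg["payload"]

def make_sender(ttl=1):
    """UDP socket for publishing; ttl=1 keeps datagrams on the local subnet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl))
    return sock

def make_receiver():
    """UDP socket bound to PORT and joined to GROUP on the default interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # struct ip_mreq: 4-byte group address followed by 4-byte interface address
    mreq = socket.inet_aton(GROUP) + socket.inet_aton("0.0.0.0")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock

def publish(sock, topic, payload):
    """Send one event to every server that has joined the group."""
    sock.sendto(encode_event(topic, payload), (GROUP, PORT))
```

A receiving server would loop on `data, addr = sock.recvfrom(65535)` and pass `data` to `decode_event`, then forward the event to whichever of its own clients are subscribed. UDP gives no delivery guarantee, so a real deployment would have to decide what happens when a datagram is lost.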


Solution

  • It's hard to advise which strategy would be best without knowing more details. What might help is a list of things to consider for each option:

    1. UDP Broadcast

      • As you mention, this will be the easiest to implement.
      • Why is the limit 20-30 nodes? Will that limit work with your requirements? If so, go with it.
      • Could the UDP broadcast messages be affected by network elements such as firewalls?
    2. Fully interconnected TCP network

      • This option could be a maintenance nightmare: every server has to be configured with a consistent list of IP addresses, and that list has to be kept up to date.
      • How will a particular server know which is the next server to send the message to? This logic could become complex.
    3. Central Server

      • Personally, I would consider this the second choice after [1].
      • This central server may need some quite complex processing to know where to send the messages.
    4. Central Server with a tree

      • A configuration and maintenance nightmare.
      • The complex logic mentioned in [3] will be even worse with this solution.
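The "complex processing" a central server needs in option [3] is mostly a subscription table that maps each topic to the worker servers that currently have interested clients. A minimal sketch, assuming hypothetical names (`Hub`, `subscribe`, `route`) and with the actual TCP sends replaced by a returned list of worker ids:

```python
from collections import defaultdict

class Hub:
    """Hypothetical central server state: which workers want which topics."""

    def __init__(self):
        self._workers_by_topic = defaultdict(set)

    def subscribe(self, worker_id, topic):
        """Called when a worker gains its first client interested in topic."""
        self._workers_by_topic[topic].add(worker_id)

    def unsubscribe(self, worker_id, topic):
        """Called when a worker's last client interested in topic goes away."""
        workers = self._workers_by_topic.get(topic)
        if workers is None:
            return
        workers.discard(worker_id)
        if not workers:
            # Drop empty entries so the table does not grow without bound.
            del self._workers_by_topic[topic]

    def drop_worker(self, worker_id):
        """Remove a worker from every topic, e.g. after its connection closes."""
        for topic in list(self._workers_by_topic):
            self.unsubscribe(worker_id, topic)

    def route(self, topic):
        """Return the workers an event on topic should be forwarded to."""
        return sorted(self._workers_by_topic.get(topic, ()))


hub = Hub()
hub.subscribe("worker-1", "prices")
hub.subscribe("worker-2", "prices")
print(hub.route("prices"))  # ['worker-1', 'worker-2']
```

In the tree variant [4], each intermediate layer would need a table like this for the servers below it, which is where the extra configuration effort comes from.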

    I would look at the pros and cons of each option and consider how well each one addresses your requirements. Hopefully that exercise will make the decision easier.
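One con that is easy to quantify is how many long-lived server-to-server TCP connections each option needs as N grows. A small sketch with hypothetical function names; it counts connections only and says nothing about message volume or latency:

```python
def mesh_links(n):
    """Option [2]: every pair of the n servers holds one TCP connection."""
    return n * (n - 1) // 2

def star_links(n):
    """Option [3]: each of the n worker servers holds one connection to the hub."""
    return n

def multicast_links(n):
    """Option [1]: UDP multicast needs no connections between servers."""
    return 0

for n in (5, 10, 30, 100):
    print(n, mesh_links(n), star_links(n), multicast_links(n))
# 5 10 5 0
# 10 45 10 0
# 30 435 30 0
# 100 4950 100 0
```

The full mesh grows quadratically, so at the 20-30 nodes mentioned in the question it already needs a few hundred connections, while the central-server layout grows linearly but concentrates all traffic on one machine.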