Search code examples
network-programmingdistributed-computingdistributed

What's an elegant/scalable way to dispatch hash queries, whose key is a string, to multiple machines?


I want to make it scalable. Suppose letters are all in lower case. For example, if I only have two machines, queries whose first character is within a ~ m can be dispatched to the first machine, while the n ~ z queries can be dispatched to the second machine.

However, when the third machine comes, to make the queries spread as even as possible, I have to re-calculate the rules and re-distribute the contents stored in the previous two machines. I feel it could be messy. For example, the more complex case, when I already have 26 machines, what should I do when the 27th one comes? What do people usually do to achieve the scalability here?


Solution

  • The process of (self-) organizing machines in a DHT to split the load of handling queries to a pool of objects is called Consistent Hashing: https://en.wikipedia.org/wiki/Consistent_hashing

    I don't think there's a definitive answer to your question.

    First is the question of balance. The DHT is balanced when:

    • each node is under similar load? (load balancing is probably what you're after)
    • each node is responsible for similar amounts of objects? (this is what you seemed to suggest)
    • (less likely) each node is responsible for similar amount of the addressing space?

    I believe your objective is to make sure none of the machines is overloaded. Unless queries to a single object are enough to saturate a single machine, this is unlikely to happen if you rebalance properly.

    If one of the machines is under significantly lower load than the other, you can make the less-load machine take over some of the objects of the higher-load machine by shifting their positions in the ring.

    Another way of rebalancing is through virtual nodes -- each machine can simulate being k machines. If its load is low, it can increase the amount of virtual nodes (and take over more more objects). If its load is high, it can remove some of its virtual nodes.