database, nosql, distributed-computing, consistent-hashing

Unavailable nodes in consistent hashing


From everything I have read, in consistent hashing, if a node crashes, the keys handled by that node will be re-mapped to the adjacent node in the hash ring. This conceptually makes sense to me.

What I don't understand is how this would work in practice for a distributed database. How can the data be moved to another node if the node has crashed? Does it assume there is a backup/standby cluster available? Or redundant nodes it can be copied from?


Solution

  • Yes. The data is copied from other nodes in the cluster that hold replicas of it. If the data was never replicated, there is no way to recover it once the node has crashed.

    Consistent hashing assigns each key to a single node. How, then, are the other nodes that hold replicas of the key identified?

    The answer is that the replication strategy is built on top of consistent hashing. First, the node that owns the key is identified using consistent hashing. Second, the system replicates the data using a separate algorithm. One such strategy is to write the data to the nodes that come next after the owning node, moving clockwise around the hash ring. With that strategy, the adjacent node that takes over a crashed node's keys already holds a copy of them. This is only one example; other replication strategies exist.
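
The clockwise-successor strategy described above can be sketched roughly as follows. This is a minimal illustration, not how any particular database implements it: the `HashRing` class, the `ring_position` helper, the `replicas` parameter, the node names, and the choice of MD5 are all assumptions made for the example, and virtual nodes are left out for brevity.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical ring: one position per node, no virtual nodes."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        # Sorted (position, node) pairs, so a key's owner can be found by bisection.
        self._ring = sorted((ring_position(n), n) for n in nodes)

    def remove_node(self, node):
        """Drop a node from the ring, e.g. after it has crashed."""
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def preference_list(self, key):
        """Return the key's owner followed by its clockwise successors."""
        if not self._ring:
            return []
        # First node at or after the key's position; the modulo wraps past the end.
        start = bisect.bisect_left(self._ring, (ring_position(key), ""))
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = HashRing(["node-a", "node-b", "node-c", "node-d", "node-e"], replicas=3)
before = ring.preference_list("user:42")

ring.remove_node(before[0])  # simulate the owning node crashing
after = ring.preference_list("user:42")

# The new owner is the old first successor, which already holds a replica.
assert after[:2] == before[1:]
```

After the owner is removed, the key's new owner and first replica are nodes that were already on its preference list, so no data has to be read from the crashed node. Only the newly added last entry in `after` needs to receive a copy from one of the survivors to restore the replica count.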