storage, distributed, p2p, kademlia

A "durable" Kademlia network?


A while ago I played around with the Kademlia (KAD) protocol. I understood how it worked and I got the idea that one might use it to create a distributed data store.

There is one problem, though. In Kademlia, every data item has a node that "owns" it. When the data is requested, it is propagated to the next node but is assigned a TTL, and once the TTL expires that copy is deleted. The idea in Kademlia is that the "owner" node refreshes the data on the other nodes before it expires there.

As far as I understand, this means the data stays cached even if the "owner" node leaves the network, but only for a while. If the owner node never comes back, every copy it pushed to other nodes will eventually expire, so after a while the data is gone.
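To make the expiry problem concrete, here is a minimal sketch of a caching node as I understand it. The class name `CachingNode`, the explicit `now` parameter, and the TTL of 10 time units are assumptions made up for illustration; they are not taken from any Kademlia implementation.

```python
class CachingNode:
    """Hypothetical node that holds copies of values, each with an expiry time."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.store = {}  # key -> (value, expires_at)

    def put(self, key, value, now):
        # Storing or refreshing a key restarts its TTL.
        self.store[key] = (value, now + self.ttl)

    def get(self, key, now):
        entry = self.store.get(key)
        if entry is None or now >= entry[1]:
            self.store.pop(key, None)  # drop the copy once it has expired
            return None
        return entry[0]


node = CachingNode(ttl=10)
node.put("k", "v", now=0)   # owner publishes the value
node.put("k", "v", now=8)   # owner refreshes it before the copy expires

alive_after_refresh = node.get("k", now=15)  # copy is valid until t=18
gone_without_owner = node.get("k", now=30)   # owner left, nobody refreshed
```

As long as the owner keeps calling `put`, the copy survives. Once the refreshes stop, the lookup returns nothing, which is exactly the durability gap described above.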

While this is okay for a P2P network where people want to share files, it is not acceptable for a distributed data store.

How could one deal with this?

Or is there another P2P protocol similar to Kademlia that takes this into account? In my mind, the "perfect" solution would always keep N nodes holding the replicated data. As soon as one of them leaves, the remaining N-1 nodes look for another node to push the data to, so that there are N replicas again.
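The N-replica idea could be sketched roughly like this. It only illustrates the behaviour I have in mind and is not how Kademlia is specified: the function names `closest_nodes` and `repair`, the small integer node IDs, and the assumption that every node knows the full set of live nodes are all hypothetical. The only part borrowed from Kademlia is the XOR distance between IDs.

```python
def xor_distance(a, b):
    # Kademlia measures closeness between IDs as their bitwise XOR.
    return a ^ b


def closest_nodes(key, live_nodes, n):
    """Return the n live node IDs closest to the key, nearest first."""
    return sorted(live_nodes, key=lambda node_id: xor_distance(node_id, key))[:n]


def repair(key, holders, live_nodes, n):
    """Drop holders that have left, then top the replica set back up to n."""
    survivors = {h for h in holders if h in live_nodes}
    if not survivors:
        raise LookupError("all replicas lost, nothing left to copy from")
    # Candidates are the remaining live nodes, nearest to the key first.
    candidates = [c for c in closest_nodes(key, live_nodes, len(live_nodes))
                  if c not in survivors]
    needed = max(0, n - len(survivors))
    return survivors | set(candidates[:needed])


key = 5
live = {1, 4, 6, 9, 12}
holders = set(closest_nodes(key, live, 3))  # initial replica set

live.remove(6)                               # one holder leaves the network
repaired = repair(key, holders, live, 3)     # survivors recruit a replacement
```

The open part is how the remaining holders notice the departure in the first place. In a real network that would need periodic pings or a similar liveness check, which this sketch leaves out.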

Does such a protocol exist?


Solution

  • Are you interested in developing your own implementation of the protocol, or in using an existing solution?

    If you want to play around with your own implementation, I would suggest looking at the Chord DHT, which I think is a good starting point.