I'm thinking of an infrastructure where many users connect to a single server and store key-value pairs in a hash table. Many such servers exist, each storing the key-value pairs of its own users. We assume keys don't clash.
A user U1 on server S1 might look up an object with key K2 that actually lives on server S2 (the user does not know this yet). We need some sort of distributed hash table (DHT) to map a key to a server_addr, so we can then query that server for the object.
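To make the lookup concrete, here is roughly what I picture (a Python sketch assuming plain consistent hashing over SHA-1; the server addresses and the assumption that every node knows the full server list are made up for illustration):

```python
import hashlib
from bisect import bisect_right

# Made-up server addresses; in reality each node would learn these via the DHT's routing protocol.
SERVERS = ["10.0.0.1:4000", "10.0.0.2:4000", "10.0.0.3:4000"]

def ring_position(data: str) -> int:
    """Map an arbitrary string to a point on a 160-bit ring (SHA-1, as Chord does)."""
    return int(hashlib.sha1(data.encode()).hexdigest(), 16)

# Sort servers by their position on the ring; each owns the arc up to its own position.
RING = sorted((ring_position(addr), addr) for addr in SERVERS)

def lookup(key: str) -> str:
    """Return the server_addr responsible for a key: its successor on the ring."""
    point = ring_position(key)
    idx = bisect_right([pos for pos, _ in RING], point) % len(RING)
    return RING[idx][1]

# U1 on S1 asks lookup("K2") and gets back an address, which may well be S2's.
```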
There are many such DHTs, such as Tapestry, Chord, etc. I have been thinking about how to make a system like this fair.
For example, in a system with three servers, S1 may have 1000 users, S2 has 2 users and S3 has 5 users. If we assume each user creates 10 objects and we distribute the key space uniformly, servers S2 and S3 will each store about 3,350 keys (1,007 users × 10 objects ÷ 3 servers ≈ 3,357), which is one to two orders of magnitude more than the keys their own users consume.
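A quick back-of-the-envelope check of that example (assuming a perfectly uniform split of the key space):

```python
# Numbers from the example above.
USERS = {"S1": 1000, "S2": 2, "S3": 5}                # users per server
OBJECTS_PER_USER = 10

total_keys = sum(USERS.values()) * OBJECTS_PER_USER   # 10,070 keys in the DHT
share_per_server = total_keys / len(USERS)            # ~3,357 with a uniform split

for server, users in USERS.items():
    own = users * OBJECTS_PER_USER                    # keys this server's users create
    print(f"{server}: creates {own:>5} keys, stores ~{share_per_server:.0f} "
          f"({share_per_server / own:.1f}x its own consumption)")
```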
I want a way for S1 to take responsibility for its fair share of keys in the DHT.
One idea I had is something like an auditing system, where each peer asks every other peer how many keys it has stored in the DHT and then checks that each peer is also responsible for that fraction of the key space.
However, this uses a lot of bandwidth just to learn each node's consumption, since every node has to query every other node.
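For reference, one audit round as I picture it would look roughly like this (a sketch; rpc_get_key_count and keyspace_fraction_of are hypothetical helpers, not anything that exists yet):

```python
def audit_round(peers, rpc_get_key_count, keyspace_fraction_of):
    """Naive audit: ask every peer how many keys it has put into the DHT,
    then check that it is responsible for at least that fraction of the key
    space. Every auditor contacts every peer, so messages grow as O(N^2)."""
    consumption = {p: rpc_get_key_count(p) for p in peers}    # one request per peer
    total = sum(consumption.values()) or 1
    return {
        p: keyspace_fraction_of(p) >= consumption[p] / total  # True = pulling its weight
        for p in peers
    }
```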
Any other ideas?
There are several possible approaches:
In the wild, DHTs are not perfectly homogeneous environments. Some nodes have more resources than others (an embedded device vs. a fat server), and some nodes create more activity than others.
A node may simply render services (routing, storage) according to its ability and refuse requests (either by dropping them or returning negative responses) once its capacity is reached. Nodes issuing the requests simply treat such refusals like failures and route around the failure points.
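A minimal sketch of what that could look like, assuming a simple STORE RPC, a made-up MAX_KEYS capacity limit and a caller-supplied send() function (not any particular DHT's actual protocol):

```python
MAX_KEYS = 10_000          # assumed capacity; a real node would derive this from disk/RAM
local_store = {}

def handle_store(key, value):
    """Storage side: accept a STORE only while below capacity, otherwise refuse."""
    if len(local_store) >= MAX_KEYS and key not in local_store:
        return {"status": "REFUSED"}      # or silently drop the request
    local_store[key] = value
    return {"status": "OK"}

def store_with_fallback(key, value, closest_nodes, send):
    """Caller side: a refusal is handled exactly like a dead node -- try the next one."""
    for node in closest_nodes:
        try:
            if send(node, "STORE", key, value).get("status") == "OK":
                return node
        except ConnectionError:
            continue
    raise RuntimeError("no node accepted the key")
```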
You should first check whether the case where some nodes consume several orders of magnitude more resources than others is actually common enough to warrant any balancing.
A node that causes more traffic may simply be designed to provide more resources. E.g. it can run multiple virtual nodes scattered throughout the key space and thus offer storage and routing for more keys. This should be especially easy for server-class machines with high uptime, high bandwidth and low latency.
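A sketch of the virtual-node idea, assuming a consistent-hash ring and vnode counts roughly proportional to each server's own consumption (the addresses and weights are made up):

```python
import hashlib
from bisect import bisect_right

def ring_position(data: str) -> int:
    return int(hashlib.sha1(data.encode()).hexdigest(), 16)

def build_ring(vnode_counts):
    """vnode_counts: server_addr -> number of virtual nodes it volunteers to run.
    More virtual nodes means the server owns a proportionally larger slice of the key space."""
    ring = []
    for addr, count in vnode_counts.items():
        for i in range(count):
            ring.append((ring_position(f"{addr}#vnode{i}"), addr))
    return sorted(ring)

def responsible_server(ring, key):
    point = ring_position(key)
    idx = bisect_right([pos for pos, _ in ring], point) % len(ring)
    return ring[idx][1]

# Weights proportional to each server's user count in the example from the question.
ring = build_ring({"S1:4000": 1000, "S2:4000": 2, "S3:4000": 5})
```

With weights like these, S1 ends up responsible for roughly the share of keys its own users create, instead of pushing most of them onto S2 and S3.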
Verifying and enforcing this is where it gets tricky. In a distributed system you have no trusted or regulating authority, so a node would have to prove that it provides adequate services before you offer to serve its requests.
The first obvious measure would be having other nodes vouch that it did indeed provide the services it claims. But that only provides evidence that it provides some services; it doesn't say anything about the ratio between offered and consumed resources. You would also need a mechanism to verify that it really does store the data it claims to, rather than just returning positive responses and then discarding the data.
So you would need accounting, verification and a web of trust, because one-hop vouchers may be inadequate.
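One common building block for the verification part is a challenge-response probe: you keep the value (or precomputed proofs) and periodically send the storing node a fresh nonce it must hash together with the data, so a node that discarded the data cannot answer. A rough sketch with a hypothetical RPC:

```python
import hashlib
import os

def expected_proof(value: bytes, nonce: bytes) -> bytes:
    """What an honest storer can only compute if it still holds the value."""
    return hashlib.sha256(nonce + value).digest()

def challenge(storer_rpc, key: str, value_i_stored: bytes) -> bool:
    """Auditor side: send a fresh nonce and compare the returned proof.
    A node that answered 'OK' and then discarded the value will fail this."""
    nonce = os.urandom(16)
    proof = storer_rpc("PROVE", key, nonce)       # hypothetical RPC
    return proof == expected_proof(value_i_stored, nonce)
```

The catch is that the challenger has to keep the value (or a stash of precomputed nonce/hash pairs) around, which is itself part of the accounting overhead mentioned above.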
As you can see, complexity quickly explodes.
You probably should look at the bigger picture and determine what incentives attackers and good citizens of your network would have, and so on.