working on bep44 implementation, i use the defined kademlia algorithm to find the closest good node given an hash id.
Using my program i do go run main.go -put "Hello World!" -kname mykey -salt foobar2 -b public
and get the value stored over a hundred nodes (good).
Now, when i run it multiple consecutive times, the sets of ip which are written by the put requests poorly intersects.
It is a problem as when i try to do a get request, the set of ips queried does not intersect with the put set, so the value is not found.
In my tests i use the public dht bootstrap node
"router.utorrent.com:6881",
"router.bittorrent.com:6881",
"dht.transmissionbt.com:6881",
When i query the nodes, I select the 8 closest nodes (nodes := s.ClosestGoodNodes(8, msg.InfoHash())
), which usually end up in a list of ~1K queries after a recursive traversal.
In my understanding, storing addresses of the info hash in the dht table is deterministic given the status of the table. As i m doing consecutive queries i expect the table to change, indeed, but not that much.
How does it happen the store nodes set does not intersect ?
Since BEP44 is an extension it is only supported by a subset of the DHT nodes, which means the iterative lookup mechanism needs to take support into account when determining whether the set of closest nodes is stable and the lookup can be terminated.
If a node returns a token
, v
or seq
field in in a get response then it is eligible for the closest-set of a read-only get.
If a node returns a token
then it is eligible for the closest-set for a get that will be followed by put operation.
So your lookup may home in on a set of nodes in the keyspace that is closest to the target ID but not eligible for the operations in question. As long as you have candidates that are closer than the best known eligible contacts you have to continue searching. I call this perimeter widening, as it conceptually broadens the search area around the target.
Additionally you also need to take error responses or the absence of a response into account when performing put
requests. You can either retry the node or try the next eligible node instead.
I have written down some additional constraints that one might want to put on the closest set in lookups for robustness and security reasons in the documentation of my own DHT implementation.
which usually end up in a list of ~1K queries after a recursive traversal.
This suggests something is wrong with your lookup algorithm. In my experience a lookup should only take somewhere between 60 and 200 udp requests to find its target if you're doing a lookup with concurrent requests, maybe even fewer when it is sequential.
Verbose logs of the terminal sets to eyeball how the lookups make progress and how much junk I am getting from peers have served me well.
In my tests i use the public dht bootstrap node
You should write your routing table to disk and reload it from there and only perform bootstrapping when none of the persisted nodes in your RT are reachable. Otherwise you are wasting the bootstrap nodes' resources and also waste time by having to re-populate your routing table first before performing any lookup.