
What are the most recent bittorrent DHT implementation recommendations?


I'm implementing yet another BitTorrent client and am currently struggling with the DHT. It is implemented according to this specification, http://www.bittorrent.org/beps/bep_0005.html, but when I started debugging it I noticed that the responses from other nodes on the network vary.

For example, find_node is supposed to return either target node info or 8 closest nodes. Most nodes reply with 34 closest nodes, and usually only 1 to 3 of those 34 successfully reply to a subsequent ping request.

Is there another document with better implementation recommendations? Maybe it has already been shown that a 15-minute interval for changing a node's state to questionable is inefficient and I should use 10 minutes or some other value? Where can I find the best up-to-date suggestions?

There is another strange thing. Bootstrap nodes like router.bittorrent.com reply with even more closest nodes, and usually the length of the "nodes" BDictionary property buffer is not divisible by 6 (compact node info: 4 bytes for the IP and 2 for the port). For now I simply cut the buffer off at the nearest length divisible by 6, but all of that is strange. Does anybody know why that might happen?


Solution

  • the spec says (emphasis mine):

    When a node receives a find_node query, it should respond with a key "nodes" and value of a string containing the compact node info for [...]

    Further down:

    Contact information for nodes is encoded as a 26-byte string. Also known as "Compact node info" the 20-byte Node ID in network byte order has the compact IP-address/port info concatenated to the end.


    Additionally you should read the original Kademlia paper since the bittorrent BEP builds on the concepts described therein and omits deeper explanations of those concepts.

    You might also want to read about a few extensions that are more or less the de facto standard for most implementations now: http://libtorrent.org/dht_extensions.html

    Also read the other DHT-related BEPs; some are fairly widely adopted and modify or clarify BEP-5-specified behavior, but generally in a backward-compatible way.


    For example, find_node is supposed to return either target node info or 8 closest nodes

    Nodes return a variable number of entries. It could be more than 8, or fewer.
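
    To make the 26-byte layout quoted from the spec concrete, here is a minimal Python sketch of splitting a find_node "nodes" string into entries. The function name parse_compact_nodes and the choice to silently drop a trailing partial entry are my own assumptions, not something BEP 5 prescribes, and it handles IPv4 entries only. Note that 8 entries of 26 bytes are 208 bytes, which reads as 34 "nodes" with 4 bytes left over if you split at 6 bytes instead; that would be consistent with both the 34 entries and the length not divisible by 6 described in the question.

```python
import socket
import struct

NODE_ID_LEN = 20       # Node ID, network byte order
COMPACT_NODE_LEN = 26  # 20-byte Node ID + 4-byte IPv4 address + 2-byte port


def parse_compact_nodes(buf: bytes):
    """Split a find_node "nodes" value into (node_id, ip, port) tuples.

    Assumption: a trailing partial entry is ignored rather than treated
    as an error; pick whatever policy suits your client.
    """
    usable = len(buf) - len(buf) % COMPACT_NODE_LEN
    nodes = []
    for off in range(0, usable, COMPACT_NODE_LEN):
        node_id = buf[off:off + NODE_ID_LEN]
        ip = socket.inet_ntoa(buf[off + NODE_ID_LEN:off + NODE_ID_LEN + 4])
        # Port is a 2-byte unsigned integer in network (big-endian) order.
        (port,) = struct.unpack("!H", buf[off + NODE_ID_LEN + 4:off + COMPACT_NODE_LEN])
        nodes.append((node_id, ip, port))
    return nodes
```

    Since replies carry a variable number of entries, the count is simply len(buf) // 26 rather than a fixed 8.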