Search code examples
gobit-manipulationdhtkademlia

Routing Table in Kadmelia K-Bucket algorithm without looping over each bit in the Node ID


Context

I'm trying to implement Kadmelia's K-Bucket algorithm to keep track of closer nodes. I understand in theory how the algorithm works

  1. When a new node is added
  2. If the bucket size hasn't exceeded k (bucket size) we add it to the current bucket
  3. else we split the bucket and divide the contacts in the parent bucket by looping over each bit and split them into two buckets.
  4. This also implies that for a given node there would be k * 8 buckets (or lists)

Question

The question is in reference to the approach adopted in this example http://blog.notdot.net/2009/11/Implementing-a-DHT-in-Go-part-1

Given that we have defined a Node as a byte array of length 20



const IdLength = 20
type NodeID [IdLength]byte

I'm trying to understand what the PrefixLen function does to actually compute the prefix and populate the routing table. I understand what each component of the method does. By that I mean, I understand what the bit shift operators do and AND with 1 to check if the bit is set.

What I fail to understand is the return values and why they're set in that way.



func (node NodeID) PrefixLen() (ret int) {
  for i := 0; i < IdLength; i++ {
    for j := 0; j < 8; j++ {
      if (node[i] >> uint8(7 - j)) & 0x1 != 0 {
        return i * 8 + j;
      }
    }
  }
  return IdLength * 8 - 1;
}

How is the return value suitable to be used as an index for the routing table?


  ...
  prefix_length := contact.id.Xor(table.node.id).PrefixLen();
  bucket := table.buckets[prefix_length];
  ...

How is this approach identical to looping over each bit? How does the author achieve the same result using the PrefixLen method.

Could you help me understand this with examples. Thanks in advance.


Solution

  • How is this approach identical to looping over each bit?

    The loop simply iterates over bytes and then for each byte over the bits doing shifts and masks. So in effect it does iterate over all bits, the bits just happen to be packed into bytes since the smallest addressable unit of memory generally is one byte.

    What I fail to understand is the return values and why they're set in that way.

    It simply calculates the bit position in terms of i complete bytes and j bits in the last partial byte.

    Context

    The context is actually misguided here since you're trying to explain the splittable routing table design while looking at a code sample with a fixed array layout. That is a common source of confusion.