
How to divide a piece into blocks when they overlap


I'm looking for some input: I want to build a simple, minimal BitTorrent client and have been reading the protocol spec for 2-3 days now.

Here is my understanding so far. Assume that the torrent has a piece length of 26000 bytes and that, according to the unofficial spec, the block size is 16384 bytes.

The request for the first block of piece 0 would then look like this:

piece 0 
block offset 0
block length 16384

So far so good.

Now, for the next block, which overlaps pieces 0 and 1, what should the request look like?

piece 0  ## since the first byte is in piece 0, use piece 0 instead of piece 1
block offset 16384
block length 16384

Now, on the receiving end, I need to reassemble the 26000-byte piece so that I can hash it and compare the result with the corresponding entry in pieces to check the piece for correctness.

Is my understanding correct?

Also, suppose the piece verification fails, perhaps because the first block (Block 0) is faulty or corrupt. Then I would have to requeue both Block 0 and Block 1 (which was valid, by the way, and is also part of piece 1) to be transmitted again.

Suddenly the piece and block distribution becomes more complex than I assumed it would be, and I'm hoping there is a simpler solution to this.

Any thoughts?


Solution

  • Blocks never span two pieces: every request names a single piece index, and its offset and length must stay within that piece. The last block in a piece may therefore be smaller than the transfer block size, i.e. the second request for piece 0 should ask for 26000 - 16384 = 9616 bytes at offset 16384. As soon as all 26000 bytes have been received, the SHA-1 hash should be calculated and compared with the corresponding checksum from the pieces section of the metainfo dictionary. If the checksum does not match, you have no means of knowing which block contained invalid data and should re-download all blocks of this piece.

    My advice would be not to depend on any particular partitioning of the piece, because: 1) peers may use a different transfer block size when requesting data; 2) the SHA-1 algorithm is block-based, and the digester had better be fed larger chunks (otherwise the calculation will take more time).

    A proper abstraction for a piece would be a generic data range with the following methods:

    • read(from:int, length:int):byte[]
    • write(offset:int, block:byte[]):()

    Then you'll be able to read/write arbitrary subranges of data.
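
The points above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not how any particular client does it: the names `block_requests`, `Piece`, `is_complete`, `verify` and `reset` are hypothetical, the piece is held entirely in memory, and incoming blocks are assumed not to overlap each other.

```python
import hashlib

# 2**14, the block size used in the question; an assumption, not a protocol constant.
BLOCK_SIZE = 16384


def block_requests(piece_index, piece_length, block_size=BLOCK_SIZE):
    """Return (index, begin, length) tuples that cover exactly one piece."""
    requests = []
    begin = 0
    while begin < piece_length:
        # The final block is clipped so it never runs past the end of the piece.
        length = min(block_size, piece_length - begin)
        requests.append((piece_index, begin, length))
        begin += length
    return requests


class Piece:
    """Hypothetical in-memory data range for one piece."""

    def __init__(self, length, expected_sha1):
        self.length = length
        self.expected_sha1 = expected_sha1  # 20-byte digest taken from the metainfo
        self.data = bytearray(length)
        self.received = {}  # offset -> block length; assumes blocks do not overlap

    def write(self, offset, block):
        """Store a received block at the given offset inside the piece."""
        if offset < 0 or offset + len(block) > self.length:
            raise ValueError("block falls outside the piece")
        self.data[offset:offset + len(block)] = block
        self.received[offset] = len(block)

    def read(self, offset, length):
        """Return an arbitrary subrange, e.g. to serve another peer's request."""
        if offset < 0 or length < 0 or offset + length > self.length:
            raise ValueError("range falls outside the piece")
        return bytes(self.data[offset:offset + length])

    def is_complete(self):
        return sum(self.received.values()) == self.length

    def verify(self):
        # Hash the whole piece in one call instead of block by block.
        return hashlib.sha1(self.data).digest() == self.expected_sha1

    def reset(self):
        # After a failed check every block of the piece must be requested again.
        self.received.clear()
```

With the numbers from the question, `block_requests(0, 26000)` returns `[(0, 0, 16384), (0, 16384, 9616)]`. Once `is_complete()` is true, a failing `verify()` would be followed by `reset()` and a fresh round of requests for the same piece.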