
How can I decode the binary "pieces root" key in the info metadata of a BitTorrent v2 .torrent file?


In BitTorrent v2 there is a "pieces root" key (a string) that holds the root SHA-256 hash of a file in binary form. The documentation says:

"pieces root" is the the root hash of a merkle tree with a branching factor of 2, constructed from 16KiB blocks of the file. The last block may be shorter than 16KiB. The remaining leaf hashes beyond the end of the file required to construct upper layers of the merkle tree are set to zero. As of meta version 2 SHA2-256 is used as digest function for the merkle tree. The hash is stored in its binary form, not as human-readable string.

I need to extract this hash for my torrent tracker, so that the torrent's info page can show users the original hash of each file. How do I decode that binary string? I also don't know whether it is a concatenation of all the piece hashes.

PHP or C is preferred or maybe some docs. I'm a noob regarding encoding, so please explain thoroughly. Thanks a ton!!

I tried the unpack() function, but I'm missing something.


Solution

  • The hash as stored in the torrent file is not encoded. It is in the native representation that computers deal in: a sequence of bytes. For SHA2-256 that is 32 bytes (256 bits).

    If you need to represent it as text, you have to encode it yourself. There are many ways to do this. Hexadecimal is a common choice, and it is also frequently used to display the infohash of a torrent.

    "I don't know if those are concatenation of all piece hashes."

    As the BEP says, the pieces root is the root hash of a merkle tree. It cannot be obtained by concatenating the individual block hashes.

    It can only be computed from the torrent contents themselves. If you don't have the data, you can't recompute it; you can only extract it from the torrent file. But since it uses a fixed construction (independent of the piece size), the pieces root is always the same for files with identical content.
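
To make both points concrete, here is a minimal PHP sketch, not a definitive implementation. The function names `pieces_root_to_hex()` and `compute_pieces_root()` are made up for this example. The first one only hex-encodes the 32 raw bytes for display. The second rebuilds the root from file contents following the quoted BEP text, on the assumption that the leaf layer is padded with all-zero hashes up to the next power of two. It takes the whole file as one string, so a real tracker would probably want to stream large files instead.

```php
<?php
const BLOCK_SIZE = 16384; // 16 KiB leaf blocks, as in the quoted BEP text
const HASH_LEN   = 32;    // SHA2-256 digest length in bytes

/** Hex-encode a raw "pieces root" value so it can be shown as text. */
function pieces_root_to_hex(string $raw): string
{
    if (strlen($raw) !== HASH_LEN) {
        throw new InvalidArgumentException('expected a 32-byte SHA2-256 digest');
    }
    return bin2hex($raw);
}

/** Recompute the raw 32-byte pieces root from a file's contents. */
function compute_pieces_root(string $data): string
{
    if ($data === '') {
        throw new InvalidArgumentException('an empty file has no pieces root');
    }

    // Leaf layer: one SHA2-256 digest per 16 KiB block (the last may be shorter).
    $layer = [];
    foreach (str_split($data, BLOCK_SIZE) as $block) {
        $layer[] = hash('sha256', $block, true);
    }

    // Assumption: pad the leaves with zero hashes up to the next power of two.
    $width = 1;
    while ($width < count($layer)) {
        $width *= 2;
    }
    $layer = array_pad($layer, $width, str_repeat("\0", HASH_LEN));

    // Hash adjacent pairs until only the root is left.
    while (count($layer) > 1) {
        $next = [];
        for ($i = 0; $i < count($layer); $i += 2) {
            $next[] = hash('sha256', $layer[$i] . $layer[$i + 1], true);
        }
        $layer = $next;
    }
    return $layer[0];
}
```

For the tracker page, only the first function is needed: pass it the value of the "pieces root" key exactly as your bencode decoder returns it (the decoder is not shown here) and print the resulting 64-character hex string. The second function is useful only if you have the file data and want to check that it matches the stored root.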