python sha1 bittorrent

Understanding BitTorrent Pieces Output


After using https://github.com/utdemir/bencoder to extract the metainfo from a single-file torrent, I am seeing the following text under the "pieces" key of the output.

This is an abbreviated portion of the output: 'pieces':'\x8f1g\xdb\x1e\x17\n(\xf9\xbb\xb0&\xa0\xadT9N\xa8L\x89\x97\xf79\x15\x07N

After looking at https://wiki.theory.org/BitTorrentSpecification, my understanding is that this output is:

[a] string consisting of the concatenation of all 20-byte SHA1 hash values, one per piece (byte string, i.e. not urlencoded)

However, I am seeing backslashes ("\") throughout the value, and I am wondering whether these are something like hexadecimal codes, since SHA-1 digests are commonly shown in hexadecimal.


Solution

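  • The output you see from the program is printed as a Python bytes literal,
    in which non-printable bytes are escaped (for example as \xNN, where NN
    is two hexadecimal digits) and printable ASCII bytes are shown as-is.
    The hashes themselves are stored as raw 20-byte binary digests, not as
    hexadecimal text.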

    \x8f1g\xdb\x1e\x17\n(\xf9\xbb\xb0&\xa0\xadT9N\xa8L\x89\x97\xf79\x15\x07N

    \x8f => byte 0x8F
    1g => ASCII "1g" (bytes 0x31 0x67)
    \xdb\x1e\x17 => bytes 0xDB 0x1E 0x17
    \n => escape sequence for ASCII line feed (LF), byte 0x0A
    ( => ASCII "(" (byte 0x28)
    \xf9\xbb\xb0 => bytes 0xF9 0xBB 0xB0
    etc.
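
The decoding above can be sketched in Python 3. This is only an illustration, not part of the original answer: `sample` holds just the first 20 bytes of the abbreviated value from the question, and `split_piece_hashes` is a hypothetical helper name, not something provided by bencoder. It assumes the full "pieces" value is already available as a `bytes` object.

```python
# First 20 bytes of the abbreviated "pieces" value shown in the question.
# Written as a bytes literal, exactly as Python printed it.
sample = b'\x8f1g\xdb\x1e\x17\n(\xf9\xbb\xb0&\xa0\xadT9N\xa8L\x89'


def split_piece_hashes(pieces):
    """Split a 'pieces' byte string into 20-byte SHA-1 digests.

    Returns a list of lowercase hexadecimal strings, one per piece.
    (Hypothetical helper, named here only for illustration.)
    """
    if len(pieces) % 20 != 0:
        raise ValueError("length of 'pieces' must be a multiple of 20")
    return [pieces[i:i + 20].hex() for i in range(0, len(pieces), 20)]


# Each escaped or printable character is exactly one byte.
print(len(sample))                 # 20
print(hex(sample[0]))              # 0x8f  (from \x8f)
print(hex(sample[1]))              # 0x31  (ASCII "1")
print(hex(sample[6]))              # 0xa   (from \n)
print(split_piece_hashes(sample))  # ['8f3167db1e170a28f9bbb026a0ad54394ea84c89']
```

The hexadecimal form is just another way to display the same 20 bytes. The torrent file itself stores the raw bytes.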