Search code examples
bittorrentdhtkademlia

Can somebody shed a light what this strange DHT response means?


Sometimes I receive this strange responses from other nodes. Transaction id match to my request transaction id as well as the remote IP so I tend to believe that node responded with this but it looks like sort of a mix of response and request

d1:q9:find_node1:rd2:id20:.éV0özý.?tj­N.?.!2:ip4:DÄ.^7:nodes.v26:.ï?M.:iSµLW.Ðä¸úzDÄ.^æCe1:t2:..1:y1:re

Worst of all is that it is malformed. Look at 7:nodes.v it means that I add nodes.v to the dictionary. It is supposed to be 5:nodes. So, I'm lost. What is it?


Solution

  • The internet and remote nodes is unreliable or buggy. You have to code defensively. Do not assume that everything you receive will be valid.

    Remote peers might

    • send invalid bencoding, discard those, don't even try to recover.
    • send truncated messages. usually not recoverable unless it happens to be the very last e of the root dictionary.
    • omit mandatory keys. you can either ignore those messages or return an error message
    • contain corrupted data
    • include unknown keys beyond the mandatory ones. this is not an error, just treat them as if they weren't there for the sake of forward-compatibility
    • actually be attackers trying to fuzz your implementation or use you as DoS amplifier

    I also suspect that some really shoddy implementations are based on whatever string types their programming language supports and incorrectly handle encoding instead of using arrays of uint8 as bencoding demands. There's nothing that can be done about those. Ignore or occasionally send an error message.

    Specified dictionary keys are usually ASCII-mappable, but this is not a requirement. E.g. there are some tracker response types that actually use random binary data as dictionary keys.


    Here are a few examples of junk I'm seeing[1] that even fails bdecoding:

    d1:ad2:id20:�w)��-��t����=?�������i�&�i!94h�#7U���P�)�x��f��YMlE���p:q9Q�etjy��r7�:t�5�����N��H�|1�S�
    d1:e�����������������H# 
    d1:ad2:id20:�����:��m�e��2~�����9>inm�_hash20:X�j�D��nY��-������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
    d1:ad2:id20:�����:��m�e��2~�����9=inl�_hash20:X�j�D��nY���������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
    d1:ad2:id20:�����:��m�e��2~�����9?ino�_hash20:X�j�D��nY���������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
    

    [1] preserved char count. replaced all non-printable, ASCII-incompatible bytes with the unicode replacement character.