java bittorrent bencoding

Why is hashing the info dict turning out wrong?


I have been trying for ages to get the info hash for BitTorrent to work in Java, but the result always comes out wrong.

I have narrowed it down to a few lines of code where I'm 99% sure the problem is:

Bencode bencode = new Bencode(Charset.forName("UTF-8"));
byte[] fileBytes = new byte[33237]; // hardcoded to the exact file size, so no trailing zeroes
Map<String, Object> dict = bencode.decode(fileBytes, Type.DICTIONARY);
Map<String, Object> infoMap = (Map<String, Object>) dict.get("info");
// Re-encode only the info dictionary and hash the resulting bytes
ByteArrayOutputStream baos = new ByteArrayOutputStream();
BencodeOutputStream bos = new BencodeOutputStream(baos);
bos.writeDictionary(infoMap);
byte[] hash = DigestUtils.sha1(baos.toByteArray());

I have hardcoded the size of the array just to make sure the issue is not caused by a bunch of zeroes hanging around.

I have tried with both UTF-8 and US-ASCII.

I have tried two different libraries for the bencoding, so the problem is probably not there.

Edit: From the spec it seems that info_hash is the URL-encoded SHA-1 hash of the bencoded info dict. So I tried writing the dictionary out to a ByteArrayOutputStream and then SHA-1 hashing the byte[] that the ByteArrayOutputStream holds.

Does the DigestUtils.sha1 method do any URL encoding? I can't find any information on that.
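
For reference, as far as I can tell from the tracker section of the spec, info_hash is the 20-byte SHA-1 digest of the bencoded info value, and it is those 20 raw bytes that get URL-escaped for the tracker request. DigestUtils.sha1 returns the raw byte[] digest and does no URL encoding itself. Below is a minimal sketch of the two steps using only the JDK. The class name InfoHashSketch, the helper percentEncode, and the tiny sample dictionary are made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class InfoHashSketch {
    // SHA-1 of the bencoded info dictionary: always 20 raw bytes.
    static byte[] sha1(byte[] bencodedInfo) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(bencodedInfo);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Percent-encode raw bytes; only unreserved characters stay literal.
    static String percentEncode(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            int v = b & 0xFF;
            boolean unreserved = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9')
                    || v == '-' || v == '.' || v == '_' || v == '~';
            if (unreserved) {
                sb.append((char) v);
            } else {
                sb.append(String.format("%%%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the exact bencoded bytes of an info dict.
        byte[] bencodedInfo = "d6:lengthi1e4:name1:a12:piece lengthi1e6:pieces0:e"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] digest = sha1(bencodedInfo);
        System.out.println("digest length: " + digest.length);
        System.out.println("info_hash=" + percentEncode(digest));
    }
}
```

The URL-escaped form only matters for the tracker request. For comparing against what other clients display, the hex form of the same 20 bytes is the usual choice.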


Solution

  • The problem, as Encombe pointed out, was with the encoding. The Bencode specification talks about byte strings, which suggests they are just raw bytes with no character encoding applied.

    Both of the libraries I looked at converted all byte strings to some character encoding, so I wrote a Bencode library that only does the conversion when specifically asked to.

    The code above is basically correct but here is the client code I am using now:

    // Reads the .torrent file and keeps the decoded info dictionary for hashing
    public void readManifest() throws Exception {
        byte[] fileBytes = FileUtils.readFileToByteArray(file);
        ByteArrayInputStream bis = new ByteArrayInputStream(fileBytes);
        BDecoder decoder = new BDecoder(bis, "UTF-8");
        BDict dict = decoder.decodeDict();
        Map<String, Object> valueMap = dict.getValue();
        infoMap = (Map<String, Object>) valueMap.get("info");
    }
    
    public String hash() throws Exception {
        if (hash == null) {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            BEncoder encoder = new BEncoder(baos, "UTF-8");
            encoder.encodeDict(infoMap);
            hash = DigestUtils.sha1Hex(baos.toByteArray());
        }
        return hash;
    }
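
To illustrate why the charset conversion breaks the hash: the pieces value is a run of raw SHA-1 digests, and most such byte sequences are not valid UTF-8, so decoding them to a String and encoding them back changes the bytes. Here is a small sketch of that effect, using made-up sample bytes and a hypothetical class name:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ByteStringRoundTrip {
    // Decode the bytes with the given charset, then encode them straight back.
    static byte[] roundTrip(byte[] raw, Charset cs) {
        return new String(raw, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        // Made-up stand-in for a few bytes of a "pieces" value; not valid UTF-8.
        byte[] raw = {(byte) 0xE3, (byte) 0x80, (byte) 0xFF, 0x41, (byte) 0xC0};

        // Invalid sequences are replaced with U+FFFD on decode, so the bytes change.
        boolean utf8Ok = Arrays.equals(raw, roundTrip(raw, StandardCharsets.UTF_8));
        // ISO-8859-1 maps every byte value to exactly one char, so nothing is lost.
        boolean latin1Ok = Arrays.equals(raw, roundTrip(raw, StandardCharsets.ISO_8859_1));

        System.out.println("UTF-8 round trip preserved bytes: " + utf8Ok);
        System.out.println("ISO-8859-1 round trip preserved bytes: " + latin1Ok);
    }
}
```

The first line should report false and the second true. This suggests that a library which insists on Strings might still round-trip correctly with ISO-8859-1, assuming it uses the same charset for decoding and encoding, but keeping byte strings as byte[] avoids the question entirely.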