
Torrent: query URL encoding using Java


I am writing my own torrent client and am stuck at the point where I need to send a GET request containing the info hash. The request has to be formatted correctly. As it turned out, URLEncoder does not do what its name suggests here, and the other approaches I know have not gotten me to the target either. (Sorry for the bad English.)

I am trying to do this without third-party libraries.

From what I have seen, I need a "conversion from hexadecimal representation to the bytestring value of the hash", but my attempts to do so have not given the desired result.

I found these answers and a few others, but they were all in other programming languages, and I could not understand them well enough to reproduce them in my code (one was in VB.NET, another in Rust).

I also found a BitTorrent library, but even using its encoding method did not help in my program.


UPD 1: The info hash that I get after bencoding: 0a85522a2f09e42f3d63a89a0d45e4589f8b904c

Here's what I see in Wireshark:

https://bt.toloka.to/announce/h=IT5FwgeUF1& (The tracker blocks most countries, so if you want to check, use a VPN (I recommend the Netherlands))
&info_hash=%0A%85R%2A%2F%09%E4%2F%3Dc%A8%9A%0DE%E4X%9F%8B%90L
&peer_id=-UT360W-%FE%B5%95%1A%88%0A%DF%97K%E9%FD%23
&port=19708
&uploaded=0
&downloaded=0
&left=16421367202
&corrupt=0
&key=A36E3AE9
&event=stopped
&numwant=0
&compact=1
&no_peer_id=1

It encodes the info hash as follows: %0A%85R%2A%2F%09%E4%2F%3Dc%A8%9A%0DE%E4X%9F%8B%90L


UPD 2:

My problem is that I can't implement the URL encoding. I need to convert this:

0a85522a2f09e42f3d63a89a0d45e4589f8b904c

Into this:

%0a%85R%2a%2f%09%e4%2f%3dc%a8%9a%0dE%e4X%9f%8b%90L

I tried to rewrite the code from other answers on Stack Overflow, but I did not end up with anything sensible.

        String a = "0a85522a2f09e42f3d63a89a0d45e4589f8b904c";
        
        byte[] hash = a.getBytes(StandardCharsets.UTF_8);
        StringBuilder res = new StringBuilder();

        for(char element : a.toCharArray()){

            if(Character.getNumericValue(element) <= 127){
                char[] result = URLEncoder.encode(String.valueOf(element), String.valueOf(StandardCharsets.UTF_8)).toCharArray();

                if(result[0] == '%'){
                    res.append(toLowerCase(result));
                }else{

                    char[] reinfo = new char[result.length + 1];
                    reinfo[0] = '%';

                    for(int i = 0; i < result.length; i++){
                        reinfo[i + 1] = result[i];
                    }

                    res.append(toLowerCase(reinfo));
                }
            }
        }


Solution

  • Update 1:

    Okay, I realized there was another problem: URLEncoder only encodes the hash correctly if the byte array is turned into a string as ISO8859_1. If the array is decoded as UTF-8 or ASCII, the bytes above 127 are lost and URLEncoder cannot encode them correctly. Therefore, if we have the original byte array, we can write the following:

    URLEncoder.encode(new String( <your byte array> , "ISO8859_1"), "ISO8859_1");
    

    Input data:

    array = [10, -123, 82, 42, 47, 9, -28, 47, 61, 99, -88, -102, 13, 69, -28, 88, -97, -117, -112, 76].

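    Result (URLEncoder writes the hex digits in uppercase and leaves * unescaped; percent-decoding this gives the same 20 bytes):

    %0A%85R*%2F%09%E4%2F%3Dc%A8%9A%0DE%E4X%9F%8B%90L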
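As a rough end-to-end sketch of the URLEncoder approach, starting from the hex string in the question: the class name `InfoHashUrlEncoder` and the helper `hexToBytes` are made up for this illustration, and the input is assumed to be a well-formed hex string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class InfoHashUrlEncoder {

    // Hypothetical helper: turns "0a85..." into the raw bytes it spells out.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) (hi * 16 + lo);
        }
        return out;
    }

    // ISO-8859-1 maps every byte 0..255 to exactly one char and back,
    // so no byte of the hash is altered on the way through URLEncoder.
    static String encode(byte[] raw) {
        try {
            return URLEncoder.encode(new String(raw, "ISO-8859-1"), "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support ISO-8859-1.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] hash = hexToBytes("0a85522a2f09e42f3d63a89a0d45e4589f8b904c");
        System.out.println(encode(hash));
        // prints %0A%85R*%2F%09%E4%2F%3Dc%A8%9A%0DE%E4X%9F%8B%90L
    }
}
```

One thing to keep in mind with this route: URLEncoder follows the HTML form rules, so a byte 0x20 in a hash would come out as + rather than %20.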

    If the hash is only available as a hex string rather than as a byte array, the encoding logic has to be written by hand. There are several ways to do it; one of them is written out below. If you need more compact code, the same thing can be done with a "for" or "while" loop.


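    Primitive implementation of URL encoding:

    I did manage to do it in Java. As I noticed, only the characters a-z, A-Z and 0-9 appear unescaped in info_hash, so all you have to do is convert the hex pairs for those characters to the corresponding character and put % in front of every other pair. The switch below only lists the letters; digit bytes (30 to 39) fall through to the default branch and are sent as %30 to %39, which is still valid percent-encoding.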

    Here's the code I got:

    
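        // Expects the hash as an even number of hex digits, e.g. "0a85...".toCharArray().
        public static String encodeURL(char[] element){
            StringBuilder result = new StringBuilder();
    
            // Take two hex digits (one byte of the hash) per iteration.
            for(int i = 0; i + 1 < element.length; i += 2){
                result.append(encode(new String(element, i, 2)));
            }
    
            return result.toString();
        }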
    
    
        private static String encode(String sumChar){
    
            switch (sumChar){
                case "41": return "A";
                case "42": return "B";
                case "43": return "C";
                case "44": return "D";
                case "45": return "E";
                case "46": return "F";
                case "47": return "G";
                case "48": return "H";
                case "49": return "I";
                case "4A":
                case "4a": return "J";
                case "4B":
                case "4b": return "K";
                case "4C":
                case "4c": return "L";
                case "4D":
                case "4d": return "M";
                case "4E":
                case "4e": return "N";
                case "4F":
                case "4f": return "O";
                case "50": return "P";
                case "51": return "Q";
                case "52": return "R";
                case "53": return "S";
                case "54": return "T";
                case "55": return "U";
                case "56": return "V";
                case "57": return "W";
                case "58": return "X";
                case "59": return "Y";
                case "5A":
                case "5a": return "Z";
                case "61": return "a";
                case "62": return "b";
                case "63": return "c";
                case "64": return "d";
                case "65": return "e";
                case "66": return "f";
                case "67": return "g";
                case "68": return "h";
                case "69": return "i";
                case "6A":
                case "6a": return "j";
                case "6B":
                case "6b": return "k";
                case "6C":
                case "6c": return "l";
                case "6D":
                case "6d": return "m";
                case "6E":
                case "6e": return "n";
                case "6F":
                case "6f": return "o";
                case "70": return "p";
                case "71": return "q";
                case "72": return "r";
                case "73": return "s";
                case "74": return "t";
                case "75": return "u";
                case "76": return "v";
                case "77": return "w";
                case "78": return "x";
                case "79": return "y";
                case "7A":
                case "7a": return "z";
    
                default: return "%" + sumChar;
            }
        }
    
    
    
    

    Input data:

    0a85522a2f09e42f3d63a89a0d45e4589f8b904c

    Outcome:

    %0a%85R%2a%2f%09%e4%2f%3dc%a8%9a%0dE%e4X%9f%8b%90L
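For comparison, the same mapping can be sketched without the long switch by parsing each hex pair into a number and checking its range. This is only one possible compact form; the class name `HexInfoHashEncoder` and the method `encodeHex` are invented for the example, and unlike the switch it also keeps digit bytes (0x30 to 0x39) literal.

```java
class HexInfoHashEncoder {

    // Converts a hex string to percent-encoded form: ASCII letters and digits
    // are written literally, every other byte as % plus two lowercase hex digits.
    static String encodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        StringBuilder sb = new StringBuilder(hex.length() * 3 / 2);
        for (int i = 0; i < hex.length(); i += 2) {
            int hi = Character.digit(hex.charAt(i), 16);
            int lo = Character.digit(hex.charAt(i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + i);
            }
            int b = hi * 16 + lo; // value of this byte, 0..255
            boolean literal = (b >= 'a' && b <= 'z')
                    || (b >= 'A' && b <= 'Z')
                    || (b >= '0' && b <= '9');
            if (literal) {
                sb.append((char) b);
            } else {
                // Character.forDigit yields lowercase hex digits.
                sb.append('%')
                  .append(Character.forDigit(hi, 16))
                  .append(Character.forDigit(lo, 16));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeHex("0a85522a2f09e42f3d63a89a0d45e4589f8b904c"));
        // prints %0a%85R%2a%2f%09%e4%2f%3dc%a8%9a%0dE%e4X%9f%8b%90L
    }
}
```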

    Thank you to everyone who offered suggestions and advice.