
Convert Raw Bytes to String


I have an issue in my Java code and I'm stuck on a solution. I want to convert raw bytes that I captured with a network sniffer tool (Wireshark) into the actual string value of those bytes.

Basically, this is the solution I wanted to write, but I don't know how to proceed because I can't type cast it:

String array= "6549534f3030363030303036303038303038323230303030303030303030303030303430303030303030303030303030303034313830373432313730303030303130303103";
        char[] ch=array.toCharArray();
        String charTemp = "";
        byte[] byteArray= {};

        for (int i = 0; i < ch.length; i++) {
            if(i%2==0) {
                System.out.println(charTemp);
                System.out.println(byteArray.length);
                byteArray[byteArray.length]=(int) charTemp;
                charTemp="";
            }
            charTemp+=ch[i];
            System.out.print(ch[i] + " ");
        }

What I actually want is what this link does, but in my own Java code, because I want to learn byte manipulation. Also, if someone has a reference or guide for learning byte manipulation, please recommend it :) Thanks


Solution

  • There are at least 4 issues here, and on top of that, this isn't a particularly efficient algorithm (not that this is likely to matter).

    Conflating ASCII value with digit

    The character '6' actually has the value 54. No, really:

    char c = '6';
    int v = (int) c;
    System.out.println(v); // this would print 54.
    

    After all, a character could be 'A' too. What 'number' should that become? In computer systems, any character is looked up in a gigantic table that matches a symbol (such as a, or 5, or space, or enter, or ç, or é, or ☃, or even 😀) to a number.

    And the CHARACTER (or symbol, if you prefer that terminology) '6', in that table, is associated with code 54.
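
To make the difference concrete, here is a small sketch (the class and method names are made up for illustration). It contrasts casting a char, which gives its code from that table, with Character.digit, which gives the numeric value of a digit in a given base.

```java
class CharVsDigit {
    // Casting a char to int yields its character code, not the digit it depicts.
    static int codeOf(char c) {
        return (int) c;
    }

    // Character.digit yields the numeric value of the digit in the given base,
    // or -1 if the character is not a valid digit in that base.
    static int digitValue(char c, int radix) {
        return Character.digit(c, radix);
    }

    public static void main(String[] args) {
        System.out.println(codeOf('6'));         // 54
        System.out.println(digitValue('6', 16)); // 6
        System.out.println(digitValue('f', 16)); // 15
        System.out.println(digitValue('g', 16)); // -1, 'g' is not a hex digit
    }
}
```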

    Trying to cast a string to an int

    charTemp is a string and will end up containing e.g. "4f". That's 2 characters. Java correctly identifies that trying to cast this to an int doesn't make sense. (Even if a string only contains 1 character, Java does that - the compiler knows you can't convert strings to an int by casting, and it doesn't bother checking whether in this case it might possibly make some semblance of sense. It COULD make no sense, so you can't do it.)

    What you want is to convert the string "4f" into the value of it. "4f" is itself just a number in string form, just like "100" would be. Except, it's in base 16, instead of base 10. Still, a number though.

    So, how do we turn numbers-in-strings into their values? With Integer.parseInt, of course. Which has a variant where you specify the base. So:

    int v = Integer.parseInt("9", 16); // 16 = the base
    System.out.println(v); // prints 9
    v = Integer.parseInt("f", 16);
    System.out.println(v); // prints 15. That's nice.
    v = Integer.parseInt("4f", 16);
    System.out.println(v); // Prints out 79. Which is what you wanted, I guess.
    

    Bytes and ints

    You can't assign an int to an element of a byte array without a cast, and your code casts to (int). Once you fix everything above, you still have to replace that cast with (byte).
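
One thing worth knowing when you add that cast: a Java byte is signed, so values from 128 to 255 come out negative. The sketch below (hypothetical class and method names) shows the cast and the usual b & 0xFF trick for reading the byte back as 0..255.

```java
class ByteCastDemo {
    // Narrowing an int to a byte keeps only the lowest 8 bits.
    static byte toByte(int value) {
        return (byte) value;
    }

    // Masking with 0xFF reads the same 8 bits back as a value from 0 to 255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte small = toByte(Integer.parseInt("4f", 16));
        byte large = toByte(Integer.parseInt("ff", 16));
        System.out.println(small);             // 79
        System.out.println(large);             // -1, because byte is signed
        System.out.println(toUnsigned(large)); // 255
    }
}
```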

    You erroneously think arrays can grow in java

    Your byte array is empty. Arrays in Java are fixed size: yours is created with length 0 and cannot grow, so it stays empty. x[x.length] will ALWAYS cause an ArrayIndexOutOfBoundsException (because an array's length is the first index that DOES NOT work). After all, an array of, say, length 3 has 3 indices: 0, 1, and 2. 3, itself, isn't valid. Thus, in your code, byteArray[0] = ... would crash.

    One solution is to pre-size the byte array appropriately: byte[] byteArray = new byte[array.length() / 2];

    And you would have to use byteArray[i / 2] = (byte) Integer.parseInt(charTemp, 16); where i is the index of the first character of the pair held in charTemp.
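
Putting the fixes together, a minimal sketch of the whole conversion could look like this. The class name HexToBytes and the method parseHex are made up for illustration, the input is only the first few bytes of the string from the question, and decoding the result as US-ASCII is an assumption about what the captured bytes contain.

```java
import java.nio.charset.StandardCharsets;

class HexToBytes {
    // Turns a hex string such as "4f00" into the bytes it describes.
    static byte[] parseHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        // Arrays cannot grow, so size the result up front: 2 hex digits per byte.
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < hex.length(); i += 2) {
            String pair = hex.substring(i, i + 2);          // e.g. "4f"
            out[i / 2] = (byte) Integer.parseInt(pair, 16); // base 16, then narrow to byte
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = parseHex("6549534f303036");
        // Assumption: the payload is ASCII text.
        System.out.println(new String(bytes, StandardCharsets.US_ASCII)); // eISO006
    }
}
```

Stepping through the string two characters at a time avoids the temporary charTemp buffer, so the index of the byte is simply i / 2.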

    All this code is probably not needed

    There are tools and libraries that read byte arrays formatted as hexadecimal strings (a.k.a. nibbled strings) for you, so you don't have to write this code yourself. See this SO answer for more - because that's (presumably) what you're trying to do here.
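
As one possible option, and not necessarily the one the linked answer describes: assuming Java 17 or newer, the standard library's java.util.HexFormat can do the parsing. The class name HexFormatDemo and the method hexToAscii are hypothetical, and treating the bytes as US-ASCII is again an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.HexFormat;

class HexFormatDemo {
    // Parses a hex string into bytes with the JDK's HexFormat (Java 17+),
    // then decodes those bytes as ASCII text.
    static String hexToAscii(String hex) {
        byte[] bytes = HexFormat.of().parseHex(hex);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(hexToAscii("6549534f303036")); // eISO006
    }
}
```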

    An explanation of Hexadecimal counting

    Start counting, like we're playing hide-and-seek. 1, 2, 3, 4,...

    What happens after 8? We get to 9. Easy enough.

    Why do we go to '10' - 2 separate symbols, when we need to count beyond 9?

    Who knows - presumably because we humans have 10 fingers. Actually, in the times of the Roman Empire, but outside of its borders, 12 was much more common. Take out your thumb, and start counting finger segments. You have 3 on each of the 4 remaining fingers - it's an easy way to keep track of counting to 12.

    Those Nordic folks had symbols for 12 digits (0 up to 11), just like you're used to having 10 digits (0 up to 9). When you 'run out of digits', you just add a 'place' - we start counting how many times we used all the digits. "24" simply means we ran out of digits twice (so that's 10, each time, so 20), and then 4 more.

    Computers like to count in binary. 0, 1... and we ran out of digits. So, computers go 0, 1, 10, 11, 100, 101, and so on. But this gets unwieldy very quickly and hard to 'track' as a human.

    So, we compromise and use hexadecimal instead. This is counting as if you had 16 fingers. You go 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f. And then we ran out of digits, so the number after that (so, 16, in decimal!) is written as '10'. Just like those old Vikings wrote '1' and '0' when they counted the 12th thing, with hex you write '1' and '0' when you get to the 16th thing.

    And that is what you are trying to parse in here. Note how your input contains "4f" for example. That 'f' isn't a letter. It's the '9' (the last digit) of the hexadecimal system.
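
If you want to watch the counting happen, this small sketch (the class name CountingInBases and the method inBase are made up) prints the same numbers in decimal, binary, base 12 and hexadecimal using Integer.toString(n, radix).

```java
class CountingInBases {
    // Writes n using the digits of the given base (2 to 36).
    static String inBase(int n, int radix) {
        return Integer.toString(n, radix);
    }

    public static void main(String[] args) {
        System.out.println("dec\tbin\tbase12\thex");
        for (int n = 0; n <= 17; n++) {
            System.out.println(n + "\t" + inBase(n, 2) + "\t" + inBase(n, 12) + "\t" + inBase(n, 16));
        }
        System.out.println(inBase(79, 16)); // 4f
    }
}
```

In the output, the base 12 column rolls over to "10" at twelve and the hex column rolls over to "10" at sixteen, which is exactly the 'ran out of digits' moment described above.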