java encoding hex byte-shifting

Creating an ISO-8859-1 string from a HEX-string in Java, shifting bits


I am trying to convert a hex sequence to a String encoded in ISO-8859-1, UTF-8, or UTF-16BE. That is, I have a String like "0422043504410442", which represents the Cyrillic characters "Тест" (U+0422, U+0435, U+0441, U+0442) in UTF-16BE.

The code I used to convert between the two formats was:

private static String hex2String(String hex, String encoding) throws UnsupportedEncodingException {
    char[] hexArray = hex.toCharArray();

    int length = hex.length() / 2;
    byte[] rawData = new byte[length];
    for(int i=0; i<length; i++){
        int high = Character.digit(hexArray[i*2], 16);
        int low = Character.digit(hexArray[i*2+1], 16);
        int value = (high << 4) | low;
        if (value > 127)
            value -= 256;
        rawData[i] = (byte) value;
    }
    return new String(rawData, encoding);
}

This seems to work fine for me, but I still have two questions regarding this:

  1. Is there any simpler way (preferably without bit-handling) to do this conversion?
  2. How am I to interpret the line: int value = (high << 4) | low;?

I am familiar with the basics of bit-handling, though not at all with the Java syntax. I believe the first part shifts all bits to the left by 4 positions, but I don't understand the rest or why it is helpful in this situation.

I apologize for any confusion in my question; please let me know if I should clarify anything. Thank you. //Abeansits


Solution

  • Is there any simpler way (preferably without bit-handling) to do this conversion?

    None that I know of. The only simplification is to parse a whole byte at once instead of digit by digit, e.g. using int value = Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16); in the loop:

    public static byte[] hexToBytes(final String hex) {
      final byte[] bytes = new byte[hex.length() / 2];
      for (int i = 0; i < bytes.length; i++) {
        bytes[i] = (byte) Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
      }
      return bytes;
    }
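As a rough usage sketch, the byte array can be handed to the String constructor together with a charset, which is what the question's hex2String does with its encoding argument. The class name HexDecodeSketch and the helper hexToString are hypothetical names chosen for this illustration; using StandardCharsets constants instead of a charset name is an assumption that avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class HexDecodeSketch {

    // Same approach as hexToBytes in the answer: parse two hex digits per byte.
    static byte[] hexToBytes(final String hex) {
        final byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
        }
        return bytes;
    }

    // Hypothetical helper: decode the hex string with the given charset.
    static String hexToString(final String hex, final Charset charset) {
        return new String(hexToBytes(hex), charset);
    }

    public static void main(String[] args) {
        // The question's sample: four UTF-16BE code units (Cyrillic letters).
        String cyrillic = hexToString("0422043504410442", StandardCharsets.UTF_16BE);
        System.out.println(cyrillic.equals("\u0422\u0435\u0441\u0442")); // true

        // In ISO-8859-1 every byte maps to one character; 0xE9 is U+00E9.
        String latin = hexToString("E9", StandardCharsets.ISO_8859_1);
        System.out.println(latin.equals("\u00E9")); // true
    }
}
```

Note that the sketch assumes an even number of valid hex digits; Integer.parseInt throws NumberFormatException on anything else.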
    

  • How am I to interpret the line: int value = (high << 4) | low;?

    Look at this example for your last two digits (42):

    int high = 4; // binary 0100
    int low = 2;  // binary 0010
    int value = (high << 4) | low;

    // Step by step, written in binary:
    //   (0100 << 4) | 0010    shift high left by 4 bits
    //   01000000 | 00000010   bitwise OR
    //   01000010
    //   01000010 == 0x42 == 66
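The same steps can be checked with a small runnable sketch. The class name NibbleSketch and the method combine are hypothetical names used only for this illustration. It also shows that the shift-and-OR is equivalent to high * 16 + low, and that narrowing to byte already wraps values above 127, so the manual "value -= 256" in the question's code should not be necessary.

```java
final class NibbleSketch {

    // Combine two hex digits into one value in the range 0..255.
    static int combine(char highDigit, char lowDigit) {
        int high = Character.digit(highDigit, 16); // becomes the upper 4 bits
        int low = Character.digit(lowDigit, 16);   // becomes the lower 4 bits
        return (high << 4) | low;                  // same as high * 16 + low
    }

    public static void main(String[] args) {
        int value = combine('4', '2');
        System.out.println(value);                         // 66
        System.out.println(Integer.toBinaryString(value)); // 1000010
        System.out.println(value == 4 * 16 + 2);           // true

        // Values above 127 wrap around when narrowed to a byte.
        int big = combine('F', 'F');                       // 255
        System.out.println((byte) big);                    // -1
    }
}
```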