
What are the non-hex characters in HBase Shell RowKey?


I am saving my key as a byte array. When I look at the key in the HBase shell, I see non-hex values. I do not have any encoding or compression enabled.

Here is a sample. What is VNQ? What is BBW? I'm guessing there is some sort of encoding going on?

\xFB\xC6\xE8\x03\xF0VNQ\x8By\xF6\x89D\xC1\xBBW\x00\x00\x00\x00\x00\x00\x01\xF3\x00\x00\x00\x00\x00\x07\xA1\x1F

Solution

  • The HBase shell prints keys and values using a "binary string" (escaped hexadecimal) representation of byte arrays (see the Bytes.toStringBinary method). This method does one of two things to each byte:

    1. Print it as the corresponding character if the byte is a printable ASCII character (letters, digits, common punctuation).
    2. Print it as \xHH (where each 'H' is a hex digit) otherwise, for example for control characters and bytes above the ASCII range.

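    The idea is to keep the output readable. In your sample, V, N and Q are simply the bytes 0x56, 0x4E and 0x51 shown as characters, and "BBW" is really the escape \xBB followed by the byte 0x57 (W). If your keys/values consisted entirely of printable characters, the shell would not print any \xHH sequences.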
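
    The two rules above can be sketched in plain Ruby. This is only an illustration, not HBase's actual source: the helper name to_string_binary is made up here, and the printable range (space through tilde, with the backslash itself escaped) is an assumption about how Bytes.toStringBinary decides what counts as printable.

```ruby
# Illustrative sketch of the escaping rule; NOT the HBase implementation.
# Assumption: "printable" means 0x20..0x7E, excluding the backslash (0x5C).
def to_string_binary(bytes)
  bytes.map do |b|
    b &= 0xFF # accept signed Java-style bytes (-128..127) as well
    if b >= 0x20 && b <= 0x7E && b != 0x5C
      b.chr                  # printable: show the character itself
    else
      format('\\x%02X', b)   # everything else: \xHH with uppercase hex digits
    end
  end.join
end

# First 16 bytes of the key from the question.
key = [0xFB, 0xC6, 0xE8, 0x03, 0xF0, 0x56, 0x4E, 0x51,
       0x8B, 0x79, 0xF6, 0x89, 0x44, 0xC1, 0xBB, 0x57]
puts to_string_binary(key)
# prints \xFB\xC6\xE8\x03\xF0VNQ\x8By\xF6\x89D\xC1\xBBW
```

    Bytes 0x56, 0x4E, 0x51, 0x79, 0x44 and 0x57 fall in the printable range, which is why V, N, Q, y, D and W show up between the \xHH escapes.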

    If you prefer a plain hex representation instead, try the following in the HBase shell (the last line is the output):

    > import org.apache.hadoop.hbase.util.Bytes
    > Bytes.toHex(Bytes.toBytesBinary("\xFB\xC6\xE8\x03\xF0VNQ"))
    fbc6e803f0564e51
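
    To see what that conversion does without HBase on the classpath, here is a rough pure-Ruby equivalent. The helper names from_string_binary and to_hex are hypothetical stand-ins for Bytes.toBytesBinary and Bytes.toHex, and the sketch assumes the input contains only ASCII characters and \xHH escapes.

```ruby
# Illustrative stand-ins for Bytes.toBytesBinary / Bytes.toHex; not HBase code.
# Assumption: the input string is ASCII text with optional \xHH escapes.
def from_string_binary(text)
  # Each match is either an \xHH escape (first group) or one literal character.
  text.scan(/\\x(\h\h)|(.)/m).map do |hex, ch|
    hex ? hex.to_i(16) : ch.ord
  end
end

def to_hex(bytes)
  # Two lowercase hex digits per byte, no separators.
  bytes.map { |b| format('%02x', b & 0xFF) }.join
end

# Single quotes keep the backslashes literal, so the escapes reach the parser intact.
puts to_hex(from_string_binary('\xFB\xC6\xE8\x03\xF0VNQ'))
# prints fbc6e803f0564e51
```

    The literal characters V, N and Q come out as 56, 4e and 51, which confirms they are ordinary bytes of the key rather than a sign of some extra encoding.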
    

    You can also modify the HBase shell's Ruby wrappers to print data with toHex() instead of toStringBinary(). Better still, if you feel like it, contribute a patch to HBase that adds a flag for choosing between the two (see the HBase developer guide).