My code is:
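import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;
String blah = "blah";
// StandardCharsets.US_ASCII avoids the checked UnsupportedEncodingException
byte[] blahBytes = blah.getBytes(StandardCharsets.US_ASCII);
System.out.println(Arrays.toString(blahBytes));
BitSet set = BitSet.valueOf(blahBytes);
System.out.println(set.length());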
The output is:
[98, 108, 97, 104]
31
Why is length() returning 31? Shouldn't it be 32?
BitSet.length() returns the index of the highest bit set to 1, plus one; it is not the number of bits you passed in. BitSet.valueOf(byte[]) reads the array in little-endian order, so the last byte of the array supplies the highest bits. Since every byte you pass is a US-ASCII character, its most significant bit (bit 7) is always zero, so bit 31 of the set is never set. The highest set bit therefore depends on the last character of your string: for 'h' (104 = 0b01101000) it is bit 30, which gives a length of 31. If you pass "bla1" instead of "blah" the highest set bit is bit 29 and you would get 30 (demo 1). If you use control characters, such as <TAB>, you could get an even shorter bit set of 28 (demo 2).
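The three cases above can be checked with a small sketch. The class name BitSetLengthDemo and the helper bitLength are made up for illustration; only BitSet.valueOf, BitSet.length and String.getBytes come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class BitSetLengthDemo {
    // Hypothetical helper: BitSet.length() for the US-ASCII bytes of s.
    static int bitLength(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return BitSet.valueOf(bytes).length();
    }

    public static void main(String[] args) {
        // Last byte 'h' = 104 = 0b01101000: highest set bit is 24 + 6 = 30
        System.out.println(bitLength("blah"));  // prints 31
        // Last byte '1' = 49 = 0b00110001: highest set bit is 24 + 5 = 29
        System.out.println(bitLength("bla1"));  // prints 30
        // Last byte TAB = 9 = 0b00001001: highest set bit is 24 + 3 = 27
        System.out.println(bitLength("bla\t")); // prints 28
    }
}
```

In each case the length is 24 (three full bytes below the last one) plus the position of the highest set bit in the last byte, plus one.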
If you would like to get a length rounded up to the next multiple of 8, use
int roundedLength = 8 * ((set.length() + 7) / 8);
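As a quick check of that formula, here is a minimal sketch; the class name RoundedLengthDemo and the method roundUpToByte are illustrative names, not part of the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class RoundedLengthDemo {
    // Hypothetical helper: rounds a bit count up to the next multiple of 8.
    static int roundUpToByte(int bits) {
        return 8 * ((bits + 7) / 8);
    }

    public static void main(String[] args) {
        byte[] blahBytes = "blah".getBytes(StandardCharsets.US_ASCII);
        BitSet set = BitSet.valueOf(blahBytes);
        System.out.println(set.length());                // prints 31
        System.out.println(roundUpToByte(set.length())); // prints 32
        // If the original array is still available, this is the exact bit count
        System.out.println(blahBytes.length * 8);        // prints 32
    }
}
```

Note that rounding only recovers the original size when the last byte is nonzero. For input ending in a zero byte, such as "bla\0", length() is 23 and the rounded value is 24, so multiplying the array length by 8 is the safer choice when you still have the array.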