This has caught my interest and attention since yesterday. I am trying to store bits in Java and hit by Memory Overhead.
My first question regarding same is What is size of my Bitset?
Based on the answers I looked at other references and found Memory Usage guide.
Then I looked at BitSet
source code which looks like
public class BitSet implements Cloneable, java.io.Serializable {
/*
* BitSets are packed into arrays of "words." Currently a word is
* a long, which consists of 64 bits, requiring 6 address bits.
* The choice of word size is determined purely by performance concerns.
*/
private final static int ADDRESS_BITS_PER_WORD = 6;
private final static int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;
private final static int BIT_INDEX_MASK = BITS_PER_WORD - 1;
/* Used to shift left or right for a partial word mask */
private static final long WORD_MASK = 0xffffffffffffffffL;
/**
* @serialField bits long[]
*
* The bits in this BitSet. The ith bit is stored in bits[i/64] at
* bit position i % 64 (where bit position 0 refers to the least
* significant bit and 63 refers to the most significant bit).
*/
private static final ObjectStreamField[] serialPersistentFields = {
new ObjectStreamField("bits", long[].class),
};
/**
* The internal field corresponding to the serialField "bits".
*/
private long[] words;
/**
* The number of words in the logical size of this BitSet.
*/
private transient int wordsInUse = 0;
/**
* Whether the size of "words" is user-specified. If so, we assume
* the user knows what he's doing and try harder to preserve it.
*/
private transient boolean sizeIsSticky = false;
/* use serialVersionUID from JDK 1.0.2 for interoperability */
private static final long serialVersionUID = 7997698588986878753L;
/**
* Given a bit index, return word index containing it.
*/
private static int wordIndex(int bitIndex) {
return bitIndex >> ADDRESS_BITS_PER_WORD;
}
.....
}
As per the calculation based on Memory Guide
, this is what I calculated
8 Bytes: housekeeping space
12 Bytes: 3 ints
8 Bytes: long
12 Bytes: long[]
4 Bytes: transient int // does it count?
1 Byte : transient boolean
3 Bytes: padding
This sums to 45 + 3 bytes (padding to reach multiple of 8)
This means an empty BitSet
itself reserves 48 bytes
.
But my requirement was to store bits, What am I missing? What are my options here?
Thanks much
UPDATE
My requirement is that I want to store total of 64 bits
in two separate fields
class MyClass{
BitSet timeStamp
BitSet id
}
and I want to store millions of MyClass
objects in memory
This sums to 45 + 3 bytes (padding to reach multiple of 8) This means an empty BitSet itself reserves 48 bytes.
First of all I want to advise the right tool to analyze object layout schemes in JVMs - JOL. In your case(java -jar jol-cli/target/jol-cli.jar internals java.util.BitSet
) JOL produces the following result:
Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.
Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
java.util.BitSet object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) f4 df 9f e0 (11110100 11011111 10011111 11100000) (-526393356)
12 4 int BitSet.wordsInUse 0
16 1 boolean BitSet.sizeIsSticky false
17 3 (alignment/padding gap) N/A
20 4 long[] BitSet.words [0]
Instance size: 24 bytes (reported by Instrumentation API)
Space losses: 3 bytes internal + 0 bytes external = 3 bytes total
Your calculations were not correct because of static fields, thus an empty BitSet
itself reserves 24 bytes. Please note that these calculations are not 100% exact because it was not taken into account size of long[]
object. So the right results are java -jar jol-cli/target/jol-cli.jar externals java.util.BitSet
:
Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.
Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
java.util.BitSet@6b25f76bd object externals:
ADDRESS SIZE TYPE PATH VALUE
7ae321a48 24 java.util.BitSet (object)
7ae321a60 24 [J .words [0]
This means an empty BitSet itself uses 48 bytes including long array.
In order to optimize memory footprint you can write your own implementation of BitSet
. For example in your use case following options are available:
public class MyOwnBitSet {
long word1;
long word2;
}
public class MyOwnBitSet2 {
long[] word = new long[2];
}
public class MyOwnBitSet3 {
int index;
}
JOL produces the following result:
MyOwnBitSet@443b7951d object externals:
ADDRESS SIZE TYPE PATH VALUE
76ea4c7f8 32 MyOwnBitSet (object)
MyOwnBitSet2@69663380d object externals:
ADDRESS SIZE TYPE PATH VALUE
76ea53800 16 MyOwnBitSet2 (object)
76ea53810 32 [J .word [0, 0]
MyOwnBitSet3@5a2e4553d object externals:
ADDRESS SIZE TYPE PATH VALUE
76ea5c070 16 MyOwnBitSet3 (object)
Let me explain the last example MyOwnBitSet3
. In order to decrease memory footprint you can preallocate a huge array of long
/int
objects and store only pointer on the right cell. With a sufficiently large number of objects, this option is the most advantageous.