Search code examples
javabitsetbigint

Java from BigInteger to BitSet and back


In Java 8 the below code converts integer 3 to bitset and prints {0, 1} meaning the bit representation of 3 has ones at positions 0 and 1 that is 11.

System.out.println(BitSet.valueOf(new long[]{3}));

I'm interested in converting a BigInteger or long with large value, say 10000000 into the corresponding BitSet and then back - having the BitSet object (bit representation) convert it to BigInteger (long). I'm also wondering which representation occupies more memory for various values of integer numbers?


Solution

  • You can use the BigInteger.toByteArray() and BitSet.toByteArray() for these:

    BigInteger bi = new BigInteger("31415926535");
    bi = bi.multiply(new BigInteger("271828"));
    System.out.println(bi);
    BitSet bs = BitSet.valueOf(bi.toByteArray());
    System.out.println(bs);
    BigInteger bi2 = new BigInteger(bs.toByteArray());
    System.out.println(bi2);
    

    You can use the constructor of the BigInteger(byte[]) to convert the array of bytes into a BigInteger and the BitSet.valueOf(byte[]) to convert the data into the desired values.

    For the given code, it outputs:

    8539728478155980
    {1, 2, 3, 4, 9, 10, 12, 14, 17, 18, 20, 22, 23, 25, 27, 28, 29, 30, 32, 33, 35, 37, 38, 42, 44, 45, 50, 51, 54, 55}
    8539728478155980
    

    Note that the 2-complement notation is used. Thus for negative numbers, it will generate additional ones. And a 2^64-1 will require more than 64 bits to represent. This also works with big endian. (see modified answer below).

    EDIT: there is however one problem with this approach: if there are tailing zeros, these are not taken into account by the program. This can be important because they are part of the representation. You can solve this problem by adding a tailing bit:

    public static BitSet convertTo (BigInteger bi) {
        byte[] bia = bi.toByteArray();
        int l = bia.length;
        byte[] bsa = new byte[l+1];
        System.arraycopy(bia,0,bsa,0,l);
        bsa[l] = 0x01;
        return BitSet.valueOf(bsa);
    }
    
    public static BigInteger convertFrom (BitSet bs) {
        byte[] bsa = bs.toByteArray();
        int l = bsa.length-0x01;
        byte[] bia = new byte[l];
        System.arraycopy(bsa,0,bia,0,l);
        return new BigInteger(bia);
    }
    

    And call it with:

    BigInteger bi = new BigInteger("20000000");
    System.out.println(bi);
    
    BitSet bs = convertTo(bi);
    System.out.println(bs);
    
    BigInteger bi2 = convertFrom(bs);
    System.out.println(bi2);
    

    About the memory usage

    In general both methods will use approximately the same number of bits. For the first implementation (without size marker and thus errors), it is possible that sometimes, the BitSet approach will use one byte less than the BigInteger approach. With the size marker (second approach), it is the opposite: in general the BitSet will use always one extra byte, except for some rare occasions.