I came across this question in cracking the coding interview book.
Question: Given an input file with four billion non-negative integers, provide an algorithm to generate an integer which is not contained in the file. Assume you have 1 GB of memory available for this task.
My question: Why can't we use BitSet instead of Byte[] ? Won't that simplify things ?
Code:
long numberOflnts = ((long) Integer.MAX_VALUE) + I;
byte[] bitfield = new byte [(int) (numberOflnts / 8)];
void findOpenNumberQ throws FileNotFoundException {
Scanner in = new Scanner(new FileReader("file.txt"));
while (in.hasNextlntQ) {
int n = in.nextlnt ();
/* Finds the corresponding number in the bitfield by using
* the OR operator to set the nth bit of a byte
* (e.g., 10 would correspond to the 2nd bit of index 2 in
* the byte array). */
bitfield [n / 8] |= 1 « (n % 8);
}
for (int i = 0; i < bitfield.length; i++) {
for (int j = 0; j < 8; j++) {
/* Retrieves the individual bits of each byte. When 0 bit
* is found, finds the corresponding value. */
if ((bitfield[i] & (1 « j)) == 0) {
System.out.println (i * 8 + j);
return;
}
}
}
}
Follow up: What if you have only 10 MB of memory? Assume that all the values are distinct.
The question does allow for alternative solutions. Java's BitSet
can work but there are a couple of hidden traps:
BitSet
is backed by an array. Java arrays use 32bit signed int as indexes, so you effectively have 2^31 entries. Since each is a 64bit long, that is enough.