Tags: java, arrays, hash, guava, checksum

Cannot fit file in Java Byte Array


I'm working on Java code that generates a checksum for a given file. I am using Google's Guava library for hashing. Here is the code:

import java.io.File;
import java.io.IOException;

import com.google.common.hash.HashCode;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;

private HashCode doHash(File file) throws IOException {
    HashFunction hc = Hashing.murmur3_128();
    // read() loads the whole file into a single byte[] before hashing
    HashCode hsCode = hc.newHasher().putBytes(com.google.common.io.Files.asByteSource(file).read()).hash();
    return hsCode;
}

I ran this code on a file that was 2.8 GB in size. It threw the following error:

Exception in thread "main" java.lang.OutOfMemoryError: 2945332859 bytes is too large to fit in a byte array
    at com.google.common.io.ByteStreams.toByteArray(ByteStreams.java:232)
    at com.google.common.io.Files$FileByteSource.read(Files.java:154)
    ...

Is there another data structure that I can use here? Or should I look for another strategy to feed the file to the hash function?


Solution

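  • Guava's HashFunction has no method that accepts a ByteSource, but ByteSource has a hash(HashFunction) method. That method streams the file's contents through the hasher instead of loading them into one byte array, which in Java cannot hold more than Integer.MAX_VALUE (2^31 - 1) elements. Here Files is com.google.common.io.Files, as in the question. Just do it that way: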

    HashCode hsCode = Files.asByteSource(file).hash(hc);
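
For intuition, the streaming approach boils down to reading the file in small chunks and feeding each chunk to the hash as it arrives, so memory use does not grow with file size. The following is a minimal JDK-only sketch of that pattern, not Guava's implementation. It assumes SHA-256 via java.security.MessageDigest in place of Murmur3 (which the JDK does not provide); the class name StreamingChecksum, the helper toHex, and the 8 KiB buffer size are illustrative choices, not part of the original answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of Guava or the original answer.
final class StreamingChecksum {

    // Hashes a file chunk by chunk, so only one small buffer is held in memory
    // no matter how large the file is.
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192]; // assumed chunk size; any positive size works
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // read() returns the number of bytes read, or -1 at end of file
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        }
        return digest.digest();
    }

    // Renders the digest as lowercase hexadecimal for display or comparison.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

A call such as StreamingChecksum.toHex(StreamingChecksum.sha256(file.toPath())) would then give the checksum as a hex string. With Guava on the classpath, the one-line ByteSource.hash call above is the simpler choice and keeps Murmur3 as the hash function.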