java · out-of-memory · chunks

Java heap space error when trying to split a large file into chunks


I am trying to split a large file (>20 GB) into chunks so I can upload them.

This is my method:

    public List<byte[]> chunkFile() throws IOException {
        File file = new File(backupPath);

        List<byte[]> chunks = new ArrayList<>();

        try (FileInputStream fis = new FileInputStream(file)) {
            byte[] buffer = new byte[1024 * 1024 * 10]; // 10 MB read buffer

            int bytesRead;

            while ((bytesRead = fis.read(buffer)) > 0) {
                // copy the bytes just read into a fresh array and keep it;
                // every chunk of the file ends up held in memory at once
                byte[] chunk = new byte[bytesRead];
                System.arraycopy(buffer, 0, chunk, 0, bytesRead);
                chunks.add(chunk);
            }
            return chunks;
        }
    }

It works well for small files (~500 MB), but when I try it with a larger file I get this error:

Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed: java.lang.OutOfMemoryError: Java heap space] with root cause

java.lang.OutOfMemoryError: Java heap space

I need to split the file into chunks to be able to upload it to S3 as a multipart upload.


Solution

  • You cannot fit 20 GB into the heap. The default maximum heap size is typically about 1/4 of the available RAM.

    You can increase the heap size manually with a command-line parameter (e.g. -Xmx30g), but that would be a terribly bad idea here.

    Instead, create a buffer and upload each chunk right after you fill it, reusing the same buffer for the next chunk (see the sketch after the links below).

    Alternatively, you can use temporary files.

    There are guides showing how to do a multipart upload to S3 by referencing parts of an existing file in the file system:

    https://www.baeldung.com/aws-s3-multipart-upload

    https://medium.com/@radha.kandala/uploading-large-files-made-easy-s3-multipart-upload-1f198f55f660
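
    For illustration, here is a minimal sketch of the buffer-reuse approach using the AWS SDK for Java v2 (software.amazon.awssdk:s3). The method name and the bucket/key parameters are placeholders, and error handling is omitted, so treat it as a starting point rather than a drop-in replacement:

        import java.io.ByteArrayInputStream;
        import java.io.FileInputStream;
        import java.io.IOException;
        import java.util.ArrayList;
        import java.util.List;
        import software.amazon.awssdk.core.sync.RequestBody;
        import software.amazon.awssdk.services.s3.S3Client;
        import software.amazon.awssdk.services.s3.model.*;

        public void uploadInChunks(String backupPath, String bucket, String key) throws IOException {
            try (S3Client s3 = S3Client.create();
                 FileInputStream fis = new FileInputStream(backupPath)) {

                // Start the multipart upload and remember its id.
                String uploadId = s3.createMultipartUpload(CreateMultipartUploadRequest.builder()
                        .bucket(bucket).key(key).build()).uploadId();

                // One reusable 10 MB buffer. S3 requires every part except the
                // last to be at least 5 MB, so 10 MB parts are safe.
                byte[] buffer = new byte[1024 * 1024 * 10];
                List<CompletedPart> parts = new ArrayList<>();
                int partNumber = 1;
                int bytesRead;

                // readNBytes keeps reading until the buffer is full or EOF,
                // so only the final part can be smaller than the buffer.
                while ((bytesRead = fis.readNBytes(buffer, 0, buffer.length)) > 0) {
                    UploadPartRequest req = UploadPartRequest.builder()
                            .bucket(bucket).key(key)
                            .uploadId(uploadId).partNumber(partNumber)
                            .build();
                    // uploadPart blocks until the part is sent, so the same
                    // buffer can safely be refilled on the next iteration.
                    String eTag = s3.uploadPart(req, RequestBody.fromInputStream(
                            new ByteArrayInputStream(buffer, 0, bytesRead), bytesRead)).eTag();
                    parts.add(CompletedPart.builder()
                            .partNumber(partNumber).eTag(eTag).build());
                    partNumber++;
                }

                // Ask S3 to assemble the uploaded parts into the final object.
                s3.completeMultipartUpload(CompleteMultipartUploadRequest.builder()
                        .bucket(bucket).key(key).uploadId(uploadId)
                        .multipartUpload(CompletedMultipartUpload.builder().parts(parts).build())
                        .build());
            }
        }

    This way at most one 10 MB chunk is in memory at a time. A production version should also abort the upload (AbortMultipartUploadRequest) when a part fails. With the v1 SDK you can even skip reading the file yourself: its UploadPartRequest accepts a File together with withFileOffset and withPartSize, which is the file-based variant the links above describe.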