I am implementing an external sort for a big file (~30GB). After writing the chunks to disk, I create one BufferedReader(new OutputStreamWriter(new FileOutputStream(outputPath), "UTF-8"), maxBufferSize) per chunk (chunks of them in total), where maxBufferSize = Runtime.getRuntime().freeMemory() / chunks. However, I get an OutOfMemoryError.
I guess that the garbage collector did not have enough time to clean up the memory (when I pause in the debugger, the exception is not thrown), but in that case, why does Runtime.getRuntime().freeMemory() give that result?
Is it possible to explicitly trigger garbage collection, or is the only option to make the process sleep for some time?
Is it possible to explicitly call the garbage collection
Yes, it is possible, but it won't do any good.
The JVM will only throw an OOME after performing a full GC. Calling System.gc() explicitly will (most likely) just waste CPU time.
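As a side note on why freeMemory() can be misleading as a budget: it reports the free space inside the heap the JVM has currently reserved (totalMemory()), not what remains before the configured maximum (maxMemory()). The sketch below shows a commonly used approximation of the remaining headroom. The class and method names are made up for illustration, and the result is only a snapshot that can change as soon as anything allocates.

```java
public class HeapHeadroom {
    // Hypothetical helper: rough estimate of how many bytes could still be
    // allocated before the heap hits its maximum size. This is a snapshot,
    // not a guarantee.
    static long availableHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // bytes currently in use
        return rt.maxMemory() - used;                   // room left up to the max heap
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("freeMemory():  " + rt.freeMemory());
        System.out.println("totalMemory(): " + rt.totalMemory());
        System.out.println("maxMemory():   " + rt.maxMemory());
        System.out.println("approx. available: " + availableHeapBytes());
    }
}
```

Even with this estimate, sizing buffers to consume all of it would leave the heap nearly full, which is what you want to avoid.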
Actually, I think your real problem is here:
I create chunks times BufferedReader(new OutputStreamWriter(new FileOutputStream(outputPath), "UTF-8"), maxBufferSize) being maxBufferSize = Runtime.getRuntime().freeMemory() / chunks.
When you consider the various object overheads, (maxBufferSize + overheads) * chunks is probably a bit greater than the amount of free memory.
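To make that arithmetic concrete, here is a rough sketch of the estimate. One extra factor is worth flagging: the size argument of BufferedReader (and BufferedWriter) is a number of chars, not bytes, and a Java char occupies 2 bytes, so the backing array alone takes about twice maxBufferSize bytes. The per-reader overhead is an assumed placeholder value, not a measured one, and the class and method names are hypothetical.

```java
public class BufferBudget {
    // Hypothetical helper: estimated heap bytes needed for `chunks` buffered
    // readers, each with a char buffer of `bufferChars` entries.
    // `perReaderOverheadBytes` is an assumed figure covering object headers,
    // the wrapped stream and decoder, and so on.
    static long estimatedBytes(long bufferChars, int chunks, long perReaderOverheadBytes) {
        long bufferBytes = bufferChars * Character.BYTES; // char[] uses 2 bytes per element
        return (bufferBytes + perReaderOverheadBytes) * chunks;
    }

    public static void main(String[] args) {
        long free = Runtime.getRuntime().freeMemory();
        int chunks = 100;                       // assumed number of chunk files
        long maxBufferSize = free / chunks;     // the sizing used in the question
        long needed = estimatedBytes(maxBufferSize, chunks, 16 * 1024);
        System.out.println("free bytes:   " + free);
        System.out.println("needed bytes: " + needed);
    }
}
```

Under these assumptions the requirement comes out at roughly double the free memory, well before counting any other allocation the merge step makes.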
In general, it is a bad idea to run with a Java heap close to full. Even if you don't run out of space entirely, you can find that running close to full will trigger many (too many) garbage collections.
In this case, you don't really get a lot of benefit from having really large I/O buffers. My gut feeling is that buffers in the range of 8KB to 64KB should be fine. See also Peter Lawrey's comment!
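A minimal sketch of that approach, assuming the chunks are UTF-8 text files being read back for the merge phase (hence InputStreamReader rather than the OutputStreamWriter in the question). The class name, the method name and the 64K buffer size are illustrative choices, not values taken from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ChunkReaders {
    // Assumed fixed buffer size: 64K chars (about 128KB of heap per reader).
    static final int BUFFER_CHARS = 64 * 1024;

    // Opens one modestly buffered reader per chunk file. If any file fails
    // to open, the readers opened so far are closed before rethrowing.
    static List<BufferedReader> openAll(List<Path> chunkFiles) throws IOException {
        List<BufferedReader> readers = new ArrayList<>();
        try {
            for (Path chunk : chunkFiles) {
                readers.add(new BufferedReader(
                        new InputStreamReader(Files.newInputStream(chunk), StandardCharsets.UTF_8),
                        BUFFER_CHARS));
            }
            return readers;
        } catch (IOException e) {
            for (BufferedReader r : readers) {
                try {
                    r.close();
                } catch (IOException ignored) {
                    // best effort cleanup
                }
            }
            throw e;
        }
    }
}
```

With a fixed buffer size, total buffer memory grows linearly with the number of chunks and stays independent of whatever freeMemory() happens to report at that moment.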