I am writing a program that processes multiple files, each around 6 GB in size (big log files from a server). However, I am only using 25% of my CPU (1 thread out of 4 available) because I can't split the program into different threads: the work on each file has to be done sequentially.
So I was thinking about processing up to 4 files at the same time, since I have a quad-core CPU, but I am limited by the random-access performance of an HDD.
In a few days, though, I will be using a laptop with an SSD and 8 GB of RAM. Would it be possible to map, for instance, the first 1 GB of each file into memory and process the files in 4 different threads? When I reach the end of the mapped region, I should be able to map the next 1 GB of the file to proceed. Mapping 1 GB into memory should be no problem for an SSD, I suppose, because it gets around 400 MB/s read speed.
I know this can be done using FileChannel, but I'm not sure about mapping only a part of each file.
Thanks, Siebe
When you memory-map a file, the file is not actually transferred to memory (that would be the opposite of memory mapping).
Instead, you are given a memory address which the kernel treats specially: when you access it, the kernel loads a page of memory with the file content. The pages are then unloaded when the OS decides to reclaim some memory; you can think of the mapped file as a kind of extended swap space.
All this is to say that, provided you have enough address space (that is, you have a 64-bit OS and JVM), you can map a file bigger than the system memory.
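To answer the "only part of the file" bit: `FileChannel.map` takes a position and a size, so mapping a window is possible. In Java you need windows for a 6 GB file anyway, because a single `MappedByteBuffer` cannot be larger than `Integer.MAX_VALUE` bytes (about 2 GB). Below is a minimal sketch, not a tuned implementation. The class and method names (`WindowedLineCounter`, `countLines`) are made up for illustration, and counting newlines just stands in for your real per-file processing.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: walks a file sequentially through fixed-size mapped windows.
class WindowedLineCounter {

    // Counts '\n' bytes in the file, mapping at most windowSize bytes at a time.
    static long countLines(Path file, long windowSize) throws IOException {
        if (windowSize <= 0 || windowSize > Integer.MAX_VALUE) {
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            throw new IllegalArgumentException("windowSize must be in 1..Integer.MAX_VALUE");
        }
        long lines = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            for (long position = 0; position < fileSize; position += windowSize) {
                // The last window may be shorter than windowSize.
                long length = Math.min(windowSize, fileSize - position);
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_ONLY, position, length);
                // Pages are loaded by the OS as the buffer is read, not at map() time.
                while (window.hasRemaining()) {
                    if (window.get() == '\n') {
                        lines++;
                    }
                }
            }
        }
        return lines;
    }
}
```

For your case you would call something like `countLines(path, 1L << 30)` for 1 GB windows, once per file, and submit the four calls to an `Executors.newFixedThreadPool(4)`. Keep in mind that a log record can straddle a window boundary; counting newlines is not affected by that, but a real parser would have to carry the partial record over to the next window.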