Here is some code that uses Boost.Iostreams to memory-map a file and then writes to the mapped region:
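    #include <boost/iostreams/device/mapped_file.hpp>
    #include <ios>
    #include <string>
    #include <unordered_map>
    #include <utility>

    typedef std::unordered_map<int, std::string> work;

    int main()
    {
        work d;
        d[0] = "a";

        // Create a 1,000,000,000-byte file and map it read/write.
        boost::iostreams::mapped_file_params params;
        params.path = "map.dat";
        params.new_file_size = 1000000000;
        params.mode = (std::ios_base::out | std::ios_base::in);
        boost::iostreams::mapped_file mf;
        mf.open(params);

        // Treat the start of the mapping as a map object and assign d to it.
        work* w = static_cast<work*>((void*)mf.data());
        w[0] = d;
        for (int i = 1; i < 1000000000; ++i)
        {
            w->insert(std::make_pair(i, "abcdef"));
        }
        mf.close();
    }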
When I ran this on my CentOS 6 box with 8 processors and 16 GB of RAM, I observed the following:
While the data was being inserted into the memory-mapped file, RES (from the top command) increased continuously until it reached 14 GB. I was under the impression that when I mmap a file, VIRT increases and not RES. So when we write to the mmapped file, is the data first written to memory and then committed to disk? Or is some intermediate buffer/cache used?
With the help of the "free" command, I also observed that after memory usage reaches 16 GB, the buffers are used. Here are some snapshots of the free output taken at different times while the above code was running:
             total       used       free     shared    buffers     cached
Mem:      16334688   10530380    5804308          0     232576    9205532
-/+ buffers/cache:    1092272   15242416
Swap:     18579448     348020   18231428

             total       used       free     shared    buffers     cached
Mem:      16334688   13594208    2740480          0     232608    9205800
-/+ buffers/cache:    4155800   12178888
Swap:     18579448     348020   18231428

             total       used       free     shared    buffers     cached
Mem:      16334688   15385944     948744          0     232648    9205808
-/+ buffers/cache:    5947488   10387200
Swap:     18579448     348020   18231428

             total       used       free     shared    buffers     cached
Mem:      16334688   16160368     174320          0     204940    4049224
-/+ buffers/cache:   11906204    4428484
Swap:     18579448     338092   18241356

             total       used       free     shared    buffers     cached
Mem:      16334688   16155160     179528          0     141584    2397820
-/+ buffers/cache:   13615756    2718932
Swap:     18579448     338092   18241356

             total       used       free     shared    buffers     cached
Mem:      16334688   16195960     138728          0       5440      17556
-/+ buffers/cache:   16172964     161724
Swap:     18579448     572052   18007396
What does this behavior signify?
Writing the data to the memory-mapped file took much longer than writing it to memory. What is the reason for this?
I wanted to use memory mapping to bring down RES usage, as I deal with huge data. But it does not seem to work that way. I wanted to keep all the data in memory-mapped files and read it back when required.
Am I using memory mapping incorrectly? Or is that just the way it behaves?
VIRT will increase immediately (all pages are mapped into the process address space). RES will increase whenever pages are actually touched, which causes them to be paged into physical memory.
This happens for as long as there is sufficient memory available, after which the OS starts purging LRU pages from the resident sets (unless they were VirtualLock/mlock-ed or are otherwise unmovable, like kernel pages, DMA buffers, security-sensitive data, etc.).
So, the OS optimistically leaves the pages resident for as long as possible (which improves performance as long as no other processes contend for the memory).
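The VIRT/RES distinction described above can be observed directly. The following is a minimal sketch, assuming Linux: it uses plain POSIX `mmap` instead of Boost, and reads the first two fields of `/proc/self/statm` (total and resident size, in pages). The names `read_statm`, `measure_mapping_growth` and the temporary file path are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <stdexcept>
#include <sys/mman.h>
#include <unistd.h>

struct MemPages { long virt; long res; };

// Total (VIRT) and resident (RES) size of this process, in pages (Linux only).
MemPages read_statm() {
    std::ifstream in("/proc/self/statm");
    MemPages m{0, 0};
    in >> m.virt >> m.res;
    return m;
}

struct Growth {
    long virt_after_map;   // VIRT growth caused by mmap alone
    long res_after_map;    // RES growth caused by mmap alone
    long res_after_touch;  // RES growth caused by writing to the pages
};

// Maps a sparse temporary file of map_pages pages, then writes one byte
// into each of the first touch_pages pages (hypothetical helper).
Growth measure_mapping_growth(std::size_t map_pages, std::size_t touch_pages) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t len = map_pages * page;

    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) throw std::runtime_error("mkstemp failed");
    unlink(path);  // the file disappears once fd is closed
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) {
        close(fd);
        throw std::runtime_error("ftruncate failed");
    }

    MemPages before = read_statm();
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        throw std::runtime_error("mmap failed");
    }
    MemPages mapped = read_statm();

    // Each write faults one page into physical memory.
    char* bytes = static_cast<char*>(p);
    for (std::size_t i = 0; i < touch_pages; ++i) bytes[i * page] = 1;
    MemPages touched = read_statm();

    munmap(p, len);
    close(fd);
    return {mapped.virt - before.virt, mapped.res - before.res,
            touched.res - mapped.res};
}
```

Calling `measure_mapping_growth(16384, 4096)` should show VIRT growing by at least the full mapping right after `mmap`, while RES only grows by roughly the number of pages written afterwards. The kernel's counters are approximate, so the exact figures may differ a little.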
This signifies that the OS is doing its job.
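If waiting for the OS to evict pages is not acceptable, one option (not the only one, and not something the question's code does) is to flush and release a range explicitly. The sketch below assumes Linux and a `MAP_SHARED` file mapping: `msync` writes dirty pages back, and `madvise(MADV_DONTNEED)` drops them from the process, which should lower RES. `resident_pages`, `write_then_release` and the temporary file path are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <stdexcept>
#include <sys/mman.h>
#include <unistd.h>

// Resident set size of this process in pages
// (second field of /proc/self/statm, Linux only).
long resident_pages() {
    std::ifstream in("/proc/self/statm");
    long virt = 0, res = 0;
    in >> virt >> res;
    return res;
}

struct ReleaseResult {
    long res_gain_after_write;    // RES growth after dirtying the pages
    long res_drop_after_release;  // RES reduction after msync + madvise
};

// Dirties `pages` pages of a shared file mapping, then flushes and
// releases them (hypothetical helper for illustration).
ReleaseResult write_then_release(std::size_t pages) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t len = pages * page;

    char path[] = "/tmp/mmap_release_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) throw std::runtime_error("mkstemp failed");
    unlink(path);  // the file disappears once fd is closed
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) {
        close(fd);
        throw std::runtime_error("ftruncate failed");
    }
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        throw std::runtime_error("mmap failed");
    }

    long before = resident_pages();
    char* bytes = static_cast<char*>(p);
    for (std::size_t i = 0; i < pages; ++i) bytes[i * page] = 'x';
    long dirty = resident_pages();

    // Write dirty pages back to the file, then tell the kernel this
    // process no longer needs the range in its resident set.
    bool ok = msync(p, len, MS_SYNC) == 0 &&
              madvise(p, len, MADV_DONTNEED) == 0;
    long released = resident_pages();

    munmap(p, len);
    close(fd);
    if (!ok) throw std::runtime_error("msync/madvise failed");
    return {dirty - before, dirty - released};
}
```

The data is still in the file after the release, so touching the range again simply faults the pages back in. The pages may also remain in the kernel's page cache, so this reduces the process's RES rather than guaranteeing free system memory.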
You're writing to disk. Disk access is (a lot) slower than memory access. How often the data actually gets written out to disk depends on tuning. This answer lists some of the tuning parameters that are available on Linux (which you seem to be using):