We have a data capture system connected to a very fast 10 TB RAID 0 JBOD.
We receive 4 MiB data buffers at approximately 1.25 GB/s and write them to a sequential file that was opened with fopen(). 10 GiB is preallocated with fallocate(), and each buffer is written with fwrite(). Every 10 GiB we call fflush(), then fallocate() another 10 GiB. When the capture is complete, the file is closed with fclose().
The problem is that while the capture is underway, we can see MemFree drop and Cached shoot up in /proc/meminfo, i.e. the fflush seems to do nothing. This continues until only about 200 MiB of MemFree remains, at which point the data rate becomes extremely spiky and our capture fails.
We were hoping the spikes would fall around the 10 GiB boundaries where we call fflush, but it just doesn't seem to do anything; the file isn't flushed to disk until we call fclose.
Any reason for this behavior? Using setvbuf(hFile, NULL, _IONBF, 0) doesn't seem to have any effect either.
When you see your free memory drop, that's your OS filling its buffer cache, which can grow to consume essentially all available memory. fflush() only moves data from stdio's buffer into that cache; it does not force the kernel to write the pages to disk (that would take fsync() or similar). In addition, stdio's fwrite() is buffering on its own, so there is resource contention between the two layers. When the OS hits the upper limit of available memory, that contention causes slow, spiky writes and high memory utilization, and the resulting bottleneck makes you miss data captures.

Since you are managing your own buffers, you can use write() with O_DIRECT to bypass all of this buffering.