
Overhead of log information in files


I am running long simulations that take from several hours to several days, and I log information to files. The files can reach sizes of hundreds of MB, and they contain just a list of numbers. I am concerned about the overhead this creates. Is the overhead of this method really significant, and is there a more efficient way to do the same thing, i.e. just log information?

I am using C++, and to write the log files I just use plain fprintf. To illustrate the overhead, a practical example of the form "with files it takes this long, without them it takes this long" would be ideal.

I did some tests, but I have no idea whether the overhead grows linearly with the size of the file. What I mean is that appending a line to a 1 MB file may not cost the same as appending to a 1 GB file. Does anyone know how the overhead grows with file size?


Solution

  • You just need some back-of-the-envelope calculations, I think.

    Let "hundreds of MB" be 400 MB.
    Let "several hours to several days" be 48 hours.

    (400 * 1024 * 1024 bytes) / (3600 * 48 seconds) = 2427 bytes/sec

    Obviously, you can just watch your system or plug real numbers into the calculation, but using the rough estimate above you're logging about 2 KB/sec, which is trivial compared with what an average hard drive can sustain.

    So, no, the overhead doesn't appear to be very big. And yes, there are more efficient ways to do it, but you would probably spend more time and effort than it's worth for the minuscule savings, unless your numbers are very different from what you stated.