python c++ disk diskspace

Write Raw Numbers to Disk


It occurred to me that I have no idea how to write raw numerical values to disk. How would I do this in Python or C++?

I'm running some simulations and writing intermediate results to disk so that a run doesn't have to start from scratch if it crashes. Sadly, these values chomp up gigabytes upon gigabytes of space on my hard drive.

Would writing the numerical values to disk as raw floats take up significantly less disk space, or is there some other overhead I'm not considering?


Solution

  • The most versatile and powerful option is to use the HDF5 format through its Python interface, h5py. From the h5py website:

    It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want.

    It also has a C++ API.

    The HDF5 format is widely used in the scientific computing community and can be read and written by many software packages. Data in HDF5 format can also be manipulated quickly with the parallel utility tools.
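
Whichever container you pick, the space saving the question asks about comes from storing the values in binary rather than as text. The sketch below is not HDF5: it uses only the `array` module from Python's standard library to compare the two. The helper names, file names and random sample data are illustrative assumptions, not anything from the question.

```python
import array
import os
import random
import tempfile


def write_text(path, values):
    """Write one value per line using repr(), which round-trips a float exactly."""
    with open(path, "w") as f:
        for v in values:
            f.write(repr(v) + "\n")


def write_raw(path, values):
    """Write the values as raw C doubles in native byte order, with no header."""
    with open(path, "wb") as f:
        array.array("d", values).tofile(f)


def read_raw(path):
    """Read back a file produced by write_raw()."""
    data = array.array("d")
    count = os.path.getsize(path) // data.itemsize
    with open(path, "rb") as f:
        data.fromfile(f, count)
    return data


# Stand-in for simulation output (assumption: a flat list of floats).
random.seed(0)
values = [random.random() for _ in range(100_000)]

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "results.txt")
    raw_path = os.path.join(tmp, "results.bin")
    write_text(text_path, values)
    write_raw(raw_path, values)
    text_size = os.path.getsize(text_path)
    raw_size = os.path.getsize(raw_path)
    restored = read_raw(raw_path)

print(f"text: {text_size} bytes, raw: {raw_size} bytes")
```

The raw file costs exactly `itemsize` bytes per value (8 for a double on common platforms), while the text file needs one byte per printed character plus a line ending. Switching the type code from `"d"` to `"f"` halves the raw size again at the cost of precision. The catch is that a raw file carries no metadata, so the element type, count and byte order have to be tracked separately; HDF5 stores that information alongside the data.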