python matlab hdf5 hdf5storage

Understand python hdf5storage functions


Can anyone explain the exact difference between the hdf5storage.write() and hdf5storage.writes() functions? I've read the documentation but did not understand it.


Solution

  • This is explained in the hdf5storage docs. Quoting:

    • The main functions in this module are write() and read() which write a single Python variable to an HDF5 file (or reads and returns the read data).
    • Version 0.1.10 added two new functions: writes() and reads().
    • They write and read more than one Python variable at once, although they can still work with a single variable.
    • Calling write() opens and closes the HDF5 file. As a result, calling write() multiple times for multiple variables incurs a performance penalty. This is most noticeable with large HDF5 files.
    • In addition, savemat() and loadmat() (to work with MATLAB data) now use writes() and reads() for improved performance.
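
The performance point in the quoted notes can be checked with a rough timing sketch. The helper names `time_call` and `compare_write_strategies` and the file arguments are illustrative assumptions, not part of hdf5storage; the only library calls used are `write()` and `writes()` with the `path` and `filename` arguments shown in this answer. hdf5storage is imported inside the function so the timing helper itself has no third-party dependencies.

```python
import time


def time_call(fn):
    """Return the wall-clock seconds taken by one call to fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def compare_write_strategies(mdict, single_file, batch_file):
    """Time per-variable write() calls against one writes() call.

    mdict maps HDF5 paths to data. The two file names are hypothetical
    and should point at files that do not exist yet.
    """
    import hdf5storage  # lazy import: only needed when actually writing

    def one_at_a_time():
        # Each write() call opens and closes single_file again.
        for path, data in mdict.items():
            hdf5storage.write(data, path=path, filename=single_file)

    def all_at_once():
        # A single writes() call handles every entry in mdict.
        hdf5storage.writes(mdict, filename=batch_file)

    return time_call(one_at_a_time), time_call(all_at_once)
```

With only a few small arrays the two timings may be close; the quoted docs say the penalty is most noticeable with large HDF5 files.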

    Complete documentation for write() and writes() is available in the hdf5storage docs.

    For a Python variable named 'a', a simple write() call looks like:

    hdf5storage.write(a, path='/a', filename='data.h5') 
    

    The writes() call takes a dictionary whose keys are HDF5 paths and whose values are the data to write to the file. For a dictionary named mdict, the call looks like this:

    hdf5storage.writes(mdict, filename='data.h5') 
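
If the variables are held under plain names rather than HDF5 paths, the dictionary can be built with a small helper. This is a minimal sketch; `to_mdict` and its `root` parameter are hypothetical names introduced here, not part of hdf5storage.

```python
def to_mdict(named_values, root='/'):
    """Map plain variable names to HDF5 paths under root.

    Hypothetical helper: {'a': x} becomes {'/a': x} for the default root.
    """
    prefix = root if root.endswith('/') else root + '/'
    return {
        prefix + name.lstrip('/'): value
        for name, value in named_values.items()
    }


mdict = to_mdict({'a': [1, 2, 3], 'b': 4.5})
# mdict == {'/a': [1, 2, 3], '/b': 4.5}
```

The resulting dictionary has the shape writes() expects, so it can be passed as `hdf5storage.writes(mdict, filename='data.h5')`.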
    

    Examples of each for three arrays:

    import numpy as np
    import hdf5storage

    arr1 = np.arange(10)
    arr2 = np.arange(10, 20).reshape(5, 2)
    arr3 = np.arange(20, 30).reshape(2, 5)

    # write(): one call per variable; each call opens and closes the file
    hdf5storage.write(arr1, path='/arr1', filename='write_data.h5')
    hdf5storage.write(arr2, path='/arr2', filename='write_data.h5')
    hdf5storage.write(arr3, path='/arr3', filename='write_data.h5')

    # writes(): all variables in one call, keyed by HDF5 path
    mdict = {'/arr1': arr1, '/arr2': arr2, '/arr3': arr3}
    hdf5storage.writes(mdict, filename='writes_data.h5')
    

    The two resulting files should contain the same three datasets.
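
One way to check this is to read the same paths back from both files and compare them. The sketch below assumes NumPy data; `files_match` and its `reads` parameter are hypothetical names introduced for illustration. By default it uses `hdf5storage.reads()`, which returns the data for the requested paths in the same order.

```python
import numpy as np


def files_match(paths, file_a, file_b, reads=None):
    """Return True if every path holds equal data in both files.

    reads is a hypothetical hook for swapping in another reader;
    by default hdf5storage.reads is used.
    """
    if reads is None:
        import hdf5storage  # lazy import so the default is resolved on use
        reads = hdf5storage.reads

    # One call per file reads all requested paths together.
    data_a = reads(paths, filename=file_a)
    data_b = reads(paths, filename=file_b)
    return all(np.array_equal(a, b) for a, b in zip(data_a, data_b))
```

For the files written above, `files_match(['/arr1', '/arr2', '/arr3'], 'write_data.h5', 'writes_data.h5')` would be expected to return True.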