Tags: python, compression, hdf5, h5py

Determine if an HDF5 file created with h5py is compressed or not


I have a fairly simple question:

Can I determine if an HDF5 file generated with h5py is compressed or not (without reading the data in it)? I need to know this because I would like to change my strategy depending on whether it is compressed.

I could not find an existing answer, so I apologise if this has already been asked.


Solution

  • Compression is set per dataset, not per file. In other words, some datasets in a file might be compressed and others might not. You DO NOT need to know whether a dataset is compressed when reading the data values; decompression is handled automatically.

    However, if you still want to do this, there are several ways to check.

    1. HDF5 h5dump Utility: h5dump -H -p filename
    2. HDF5 h5ls Utility: h5ls -v filename
    3. A small amount of Python/h5py code to read the dataset's .compression attribute.

    Python code below:

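    import h5py
    with h5py.File('yourfile.h5', 'r') as h5f:
        # None if the dataset is uncompressed, otherwise the filter name, e.g. 'gzip' or 'lzf'
        print(h5f['dataset_name'].compression)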
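
Since compression is a per-dataset setting, answering the question for a whole file means checking every dataset in it. The following is a minimal sketch of one way to do that, not the only one; the helper name `dataset_compression` is made up for illustration. It walks the file with `visititems` and records each dataset's `.compression` and `.compression_opts` without reading any data values.

```python
import h5py


def dataset_compression(path):
    """Map each dataset's path in the file to (compression, compression_opts)."""
    report = {}

    def visit(name, obj):
        # visititems also passes groups; only datasets carry compression filters.
        if isinstance(obj, h5py.Dataset):
            # compression is None for an uncompressed dataset, otherwise a
            # filter name such as 'gzip' or 'lzf'.
            report[name] = (obj.compression, obj.compression_opts)
        # Returning None tells visititems to keep walking the file.

    with h5py.File(path, "r") as h5f:
        h5f.visititems(visit)
    return report
```

Calling `dataset_compression('yourfile.h5')` returns a dictionary keyed by dataset path; the file contains compressed data if any entry has a compression value other than `None`. Note that `visititems` does not follow soft or external links, so datasets reachable only through links would need separate handling.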