
How to read HDF5 files in Python


I am trying to read data from an HDF5 file in Python. I can open the file using h5py, but I cannot figure out how to access the data within it.

My code

import h5py
import numpy as np
file_name = "file.hdf5"  # placeholder path
f1 = h5py.File(file_name, 'r+')  # 'r+' opens an existing file for reading and writing

This works and the file is read. But how can I access data inside the file object f1?


Solution

  • Read HDF5

    import h5py
    filename = "file.hdf5"
    
    with h5py.File(filename, "r") as f:
        # Print all root-level object names (aka keys);
        # these can be group or dataset names
        print("Keys: %s" % list(f.keys()))
        # get first object name/key; may or may NOT be a group
        a_group_key = list(f.keys())[0]
    
        # get the object type for a_group_key: usually group or dataset
        print(type(f[a_group_key])) 
    
        obj = f[a_group_key]
        if isinstance(obj, h5py.Group):
            # For a group, list() returns the names of the objects it contains
            data = list(obj)
        else:
            # For a dataset, list() iterates over the first axis
            # and returns the values as a Python list
            data = list(obj)
            # Preferred ways to get dataset values:
            ds_obj = obj      # h5py dataset object; values are read when sliced
            ds_arr = obj[()]  # entire dataset as a numpy array
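
When the layout of a file is unknown, it can help to walk the whole hierarchy instead of looking only at the first key. The following is a minimal sketch, assuming a small throwaway file created on the spot; the path and the names "measurements" and "temperature" are hypothetical and only serve the illustration. It uses visititems to collect every dataset and then reads just a slice of one.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a small throwaway file so the sketch is self-contained.
# The path and the names "measurements" and "temperature" are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "example.hdf5")
with h5py.File(path, "w") as f:
    grp = f.create_group("measurements")
    grp.create_dataset("temperature", data=np.arange(12.0).reshape(4, 3))

found = {}  # dataset path -> (shape, dtype)


def record_dataset(name, obj):
    # visititems calls this once for every object below the root;
    # keep only datasets and skip groups
    if isinstance(obj, h5py.Dataset):
        found[name] = (obj.shape, obj.dtype)


with h5py.File(path, "r") as f:
    f.visititems(record_dataset)
    # Slicing a dataset reads only the requested rows into a numpy array
    first_rows = f["measurements/temperature"][:2]

print(found)
print(first_rows)
```

Slicing the dataset object rather than calling [()] on it is worth considering for large files, since only the requested part is loaded into memory.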
    

    Write HDF5

    import h5py
    
    # Create random data
    import numpy as np
    data_matrix = np.random.uniform(-1, 1, size=(10, 3))
    
    # Write data to HDF5
    with h5py.File("file.hdf5", "w") as data_file:
        data_file.create_dataset("dataset_name", data=data_matrix)
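
Datasets can also be organized into groups, compressed, and annotated with attributes. The sketch below is one possible way to do that, not the only one; the group name "experiment_1", the dataset name "samples", and the "units" attribute are hypothetical. It writes to a temporary path and reads the data back to check the round trip.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical output path in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "example_write.hdf5")
data_matrix = np.random.uniform(-1, 1, size=(10, 3))

with h5py.File(path, "w") as data_file:
    # Groups behave like folders inside the file
    group = data_file.create_group("experiment_1")
    # gzip compression is lossless, so the values are stored exactly
    dset = group.create_dataset("samples", data=data_matrix, compression="gzip")
    # Attributes hold small pieces of metadata next to the data
    dset.attrs["units"] = "arbitrary"

with h5py.File(path, "r") as data_file:
    dset = data_file["experiment_1/samples"]
    restored = dset[()]          # full dataset as a numpy array
    units = dset.attrs["units"]  # metadata written above

print(restored.shape, units)
```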
    

    See the h5py docs for more information.

    Alternatives

    When choosing a data format for your application, the following might be important:

    • Support by other programming languages
    • Reading / writing performance
    • Compactness (file size)
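
The last two criteria can be measured rather than guessed. As a rough sketch, the snippet below times a round trip and reports the serialized size for two formats from the Python standard library. JSON and pickle are chosen here only as assumed examples of alternatives; the sample payload is made up, and timings on such a tiny payload are indicative at best.

```python
import json
import pickle
import random
import time

# Hypothetical sample payload: 10 rows of 3 floats
random.seed(0)
rows = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(10)]


def measure(name, dumps, loads):
    # Time one serialize/deserialize round trip and record the encoded size
    start = time.perf_counter()
    blob = dumps(rows)
    restored = loads(blob)
    elapsed = time.perf_counter() - start
    return {
        "format": name,
        "bytes": len(blob),
        "seconds": elapsed,
        "round_trip_ok": restored == rows,
    }


results = [
    measure("json", lambda obj: json.dumps(obj).encode("utf-8"), json.loads),
    measure("pickle", pickle.dumps, pickle.loads),
]

for result in results:
    print(result)
```

For a realistic comparison, the same measurement would need to be repeated with data of the size and type the application actually uses.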

    See also: Comparison of data serialization formats

    If you are instead looking for a way to write configuration files, you might want to read my short article, Configuration files in Python.