
h5py open file with unknown datasets


I am trying to use h5py to open a file that was created by another program. Unfortunately, I don't know the internal structure of the file. All I know is that it should contain a 20x20 matrix, which I would like to process with numpy. Here is what I have done so far:

import numpy
import h5py
f = h5py.File('example.hdf5')
print(f.keys())

The result is as follows: KeysViewWithLock(<HDF5 file "example.hdf5" (mode r+)>)

How do I proceed from here? I want to access the matrix as a single numpy.ndarray. The h5py documentation mostly talks about creating HDF5 files, not about reading files of unknown structure. Thanks a lot.

SOLUTION (thanks to akash karothiya): use print(list(f.keys())) instead. This prints the names of the top-level groups/datasets, which can then be accessed as a = f['dataset'].
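
Note that list(f.keys()) only shows the top level of the file; datasets nested inside groups do not appear. One way to see everything is to walk the file with visititems. The following is a minimal sketch, not a definitive recipe: list_datasets is a hypothetical helper name, and the file demo.hdf5 with its "dataset" and "results/vector" entries is made up so the example can run on its own.

```python
import os
import tempfile

import h5py
import numpy


def list_datasets(h5file):
    """Return {path: (shape, dtype)} for every dataset in an open h5py file."""
    found = {}

    def visitor(name, obj):
        # visititems calls this for every group and dataset below the root
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, obj.dtype)

    h5file.visititems(visitor)
    return found


# Build a small throwaway file so the sketch is self-contained
# (file name and layout are assumptions for illustration only).
path = os.path.join(tempfile.mkdtemp(), "demo.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("dataset", data=numpy.zeros((20, 20)))
    f.create_group("results").create_dataset("vector", data=numpy.arange(5))

# Open read-only and report every dataset with its shape and dtype
with h5py.File(path, "r") as f:
    datasets = list_datasets(f)
    for name, (shape, dtype) in datasets.items():
        print(name, shape, dtype)
```

For a file of unknown origin, opening it with mode "r" as above avoids modifying it by accident.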


Solution

  • OK, as mentioned before, akash karothiya helped me find the solution. Instead of print(f.keys()), use print(list(f.keys())). This prints ['dataset']. Using this information, I can get an h5py dataset object, which I then convert into a numpy array as follows:

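    a = f['dataset']  # h5py dataset with compound fields 'real' and 'imag'
    b = numpy.zeros(numpy.shape(a), dtype=complex)
    for i in range(numpy.size(a, 0)):
        # combine the two fields of row i into complex values
        b[i, :] = numpy.asarray(a[i]['real'] + 1j*a[i]['imag'], dtype=complex)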
    

    UPDATE:
    New version without a for loop. It is potentially faster and more versatile: it works for both real and complex data, and for cubes with dimensions NxMxO as well:

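    a = f['dataset']
    data = a[()]  # read the whole dataset into memory once
    if len(a.dtype) == 0:
        # plain (non-compound) dtype: real-valued data
        b = numpy.squeeze(data)
    elif len(a.dtype) == 2:
        # compound dtype with 'real' and 'imag' fields
        b = numpy.squeeze(data['real'] + 1.0j*data['imag'])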
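
The conversion above can also be wrapped in a small function that checks the field names rather than only counting them. This is a sketch under assumptions: to_ndarray is a hypothetical helper name, and the compound array built below merely stands in for what f['dataset'][()] would return for complex data stored as 'real' and 'imag' fields.

```python
import numpy


def to_ndarray(raw):
    """Convert an array read from HDF5 into a plain or complex ndarray."""
    raw = numpy.asarray(raw)
    names = raw.dtype.names  # None for ordinary (non-compound) dtypes
    if names is None:
        return numpy.squeeze(raw)
    if set(names) == {"real", "imag"}:
        return numpy.squeeze(raw["real"] + 1j * raw["imag"])
    # Any other compound layout is left for the caller to handle
    raise TypeError(f"unsupported compound dtype with fields {names}")


# Stand-in for f['dataset'][()]: a 2x2 compound array (assumed layout)
compound = numpy.zeros((2, 2), dtype=[("real", "f8"), ("imag", "f8")])
compound["real"] = 1.0
compound["imag"] = 2.0

b = to_ndarray(compound)
print(b.dtype, b.shape)
```

Raising an error for unexpected field names makes it obvious when a file uses a different layout, instead of leaving b undefined.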