I want to use Python3 package h5py to read a matlab .mat file of version 7.3.
It contains a variable in matlab, named results.
It contains a 1*1 cell, and the value in the struct inside is what I need.
In matlab, I can get these data through the following code:
load('.mat PATH');
results{1}.res
How should I read this data in h5py? Example .mat file can be obtained from here
While h5py
can read h5
files from MATLAB, figuring out what is there takes some exploring - looking at keys
groups
and datasets
(and possibly attr). There's nothing in scipy
that will help you (scipy.io.loadmat
is for the old MATLAB mat format).
With the downloaded file:
In [61]: f = h5py.File('Downloads/Basketball_ECO_HC.mat','r')
In [62]: f
Out[62]: <HDF5 file "Basketball_ECO_HC.mat" (mode r)>
In [63]: f.keys()
Out[63]: <KeysViewHDF5 ['#refs#', 'results']>
In [65]: f['results']
Out[65]: <HDF5 dataset "results": shape (1, 1), type "|O">
In [66]: arr = f['results'][:]
In [67]: arr
Out[67]: array([[<HDF5 object reference>]], dtype=object)
In [68]: arr.item()
Out[68]: <HDF5 object reference>
I'd have to check the h5py
docs to see if I can check that object reference further. I'm not familiar with it.
But exploring the other key
:
In [69]: list(f.keys())[0]
Out[69]: '#refs#'
In [70]: f[list(f.keys())[0]]
Out[70]: <HDF5 group "/#refs#" (2 members)>
In [71]: f[list(f.keys())[0]].keys()
Out[71]: <KeysViewHDF5 ['a', 'b']>
In [72]: f[list(f.keys())[0]]['a']
Out[72]: <HDF5 dataset "a": shape (2,), type "<u8">
In [73]: _[:]
Out[73]: array([0, 0], dtype=uint64)
In [74]: f[list(f.keys())[0]]['b']
Out[74]: <HDF5 group "/#refs#/b" (7 members)>
In [75]: f[list(f.keys())[0]]['b'].keys()
Out[75]: <KeysViewHDF5 ['annoBegin', 'fps', 'fps_no_ftr', 'len', 'res', 'startFrame', 'type']>
In [76]: f[list(f.keys())[0]]['b']['fps']
Out[76]: <HDF5 dataset "fps": shape (1, 1), type "<f8">
In [77]: f[list(f.keys())[0]]['b']['fps'][:]
Out[77]: array([[22.36617883]])
In the OS shell , I can look at the file with h5dump
. From that it looks like the res
dataset has the most data. The datasets also have attributes. That may be a better way of getting an overview, and use that to guide the h5py
loads.
In [80]: f[list(f.keys())[0]]['b']['res'][:]
Out[80]:
array([[198., 196., 195., ..., 330., 328., 326.],
[214., 214., 216., ..., 197., 196., 192.],
[ 34., 34., 34., ..., 34., 34., 34.],
[ 81., 81., 81., ..., 81., 80., 80.]])
In [81]: f[list(f.keys())[0]]['b']['res'][:].shape
Out[81]: (4, 725)
In [82]: f[list(f.keys())[0]]['b']['res'][:].dtype
Out[82]: dtype('<f8')