I am trying to create an HDF5 file to store some generated data. Saving seems to work (I think), but when I read the file back, some of the data appears to be lost. The code for both saving and loading is below.
saving data
import numpy as np
import xarray as xa
import h5py
import string
import random
save_h5py = h5py.File(".\data.h5", "w")
ids = [''.join(random.choice(string.ascii_uppercase) for _ in range(5)) for _ in range(10)]
for i in ids:
    data = np.random.rand(10)
    my_array = xa.DataArray(data, dims=["id"], coords={"id": ids})
    save_h5py.create_dataset(i, data=my_array)
save_h5py.close()
output for one of the xarray objects
<xarray.DataArray (id: 10)>
array([0.50655903, 0.33954833, 0.04186272, 0.16765385, 0.59900345,
0.58764172, 0.38523892, 0.77926545, 0.61928491, 0.65678961])
Coordinates:
* id (id) <U5 'ESBNB' 'LEQDR' 'XVKFK' ... 'SSWBW' 'VMKYK' 'QSHXN'
loading data
file = h5py.File(".\data.h5", "r")
data = file.get(ids[2])
data_array = data[:]
result for reading
<HDF5 dataset "XVKFK": shape (10,), type "<f2">
array([0.50655903, 0.33954833, 0.04186272, 0.16765385, 0.59900345,
0.58764172, 0.38523892, 0.77926545, 0.61928491, 0.65678961])
The trouble is here: how do I recover the ids? I tried many ways to access this data, but with no luck. I thought the ids might not have been saved, so I tried to load the file in hdf5Viewer to check whether they were present. However, for some reason the program reports the file as unreadable.
To get your code working (for me), I modified the lines that create the datasets so that they pass the NumPy array directly instead of wrapping it in an xarray DataArray. See below:
for i in ids:
    data = np.random.rand(10)
    save_h5py.create_dataset(i, data=data)
save_h5py.close()
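Note that create_dataset stores only the array values, so the id coordinate labels themselves are not written to the file. One possible way to keep them is to attach them to each dataset as an attribute. The following is a minimal sketch under some assumptions: the attribute name "id", the three hard-coded labels, and the temporary file path are all made up for illustration, and the labels are assumed to be plain ASCII.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical labels standing in for the random ids from the question.
ids = ["AAAAA", "BBBBB", "CCCCC"]
path = os.path.join(tempfile.mkdtemp(), "data.h5")

# Save: the values go in the dataset, the labels go in an attribute.
with h5py.File(path, "w") as h5f:
    for name in ids:
        values = np.random.rand(len(ids))
        ds = h5f.create_dataset(name, data=values)
        # Store the labels as fixed-length byte strings ("S" dtype).
        ds.attrs["id"] = np.array(ids, dtype="S")

# Load: read the values and decode the labels back to Python strings.
with h5py.File(path, "r") as h5f:
    ds = h5f[ids[2]]
    loaded_values = ds[()]
    loaded_ids = [label.decode("ascii") for label in ds.attrs["id"]]

print(loaded_ids)
```

With the values and labels recovered, the DataArray could be rebuilt the same way as in the question, e.g. xa.DataArray(loaded_values, dims=["id"], coords={"id": loaded_ids}).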
h5py uses Python's dictionary syntax to access HDF5 objects (the key is the object name, and the value is the object). Note: the objects are not dictionaries! The code below shows how this works for your example:
with h5py.File('data.h5', 'r') as h5f:
    for ds_name in h5f:
        print(ds_name)
        print(h5f[ds_name][()])
The example demonstrates 2 other important points:
Use the with/as file context manager to avoid file corruption and locking issues.
Use [()] instead of [:] to read the dataset contents.
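To illustrate the second point, here is a small sketch of one case where the two differ: [()] reads any dataset, including a scalar one, while [:] is rejected on a scalar dataset. The dataset names "vector" and "scalar" and the temporary file path are hypothetical, and the exact exception type raised by [:] may depend on the h5py version, so the sketch catches both likely candidates.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "points.h5")

# Write one 1-D dataset and one scalar dataset (hypothetical names).
with h5py.File(path, "w") as h5f:
    h5f.create_dataset("vector", data=np.arange(3.0))
    h5f.create_dataset("scalar", data=1.5)

with h5py.File(path, "r") as h5f:
    vector = h5f["vector"][()]  # whole array, same result as [:]
    scalar = h5f["scalar"][()]  # works for a scalar dataset too
    try:
        h5f["scalar"][:]  # slicing a scalar dataset is not allowed
        slice_ok = True
    except (TypeError, ValueError):
        slice_ok = False

print(vector, scalar, slice_ok)
```

For the 1-D datasets in the question both forms return the same array, so the choice mainly matters once scalar datasets are involved.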