I am using compound datatypes with h5py, where some fields are variable-length arrays, and I can't find a way to set such a field. The following MWE shows six ways I tried (sequential indexing, which would not work in h5py anyway; fused indexing; read-modify-commit for columns and rows), none of which works.
What is the correct way? Why does h5py say "Cannot change data-type for object array" when writing an integer list to an int32 list?
import h5py
import numpy as np

with h5py.File('/tmp/test-vla.h5','w') as h5:
    dt=np.dtype([('a',h5py.vlen_dtype(np.dtype('int32')))])
    dset=h5.create_dataset('test',(5,),dtype=dt)
    dset['a'][2]=[1,2,3] # does not write the value back
    dset[2]['a']=[1,2,3] # does not write the value back
    dset['a',2]=[1,2,3] # Cannot change data-type for object array
    dset[2,'a']=[1,2,3] # Cannot change data-type for object array
    tmp=dset['a']; tmp[2]=[1,2,3]; dset['a']=tmp # Cannot change data-type for object array
    tmp=dset[2]; tmp['a']=[1,2,3]; dset[2]=tmp # 'list' object has no attribute 'dtype'
When working with compound datasets, I've found it best to write all of a row's data in a single statement. I tweaked your code to show how to add 3 rows of data (each of a different length). Note how I: 1) define the row of data with a tuple; 2) define the list of integers with np.array(); and 3) don't reference the field name ['a'].
import h5py
import numpy as np

with h5py.File('test-vla.h5','w') as h5:
    dt=np.dtype([('a',h5py.vlen_dtype(np.dtype('int32')))])
    dset=h5.create_dataset('test',(5,),dtype=dt)
    print(dset.dtype, dset.shape)
    # Each row is a one-element tuple holding the variable-length array
    dset[0] = ( np.array([0,1,2]), )
    dset[1] = ( np.array([1,2,3,4]), )
    dset[2] = ( np.array([0,1,2,3,4]), )
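To check that the rows really end up in the file, you can reopen it and read them back. The following is only a minimal sketch of that round trip: the temporary path and the names rows and read_back are my own choices for illustration, and I pass dtype='int32' explicitly to np.array() so the arrays match the field's base type.

```python
import os
import tempfile

import h5py
import numpy as np

# Same three rows as above, each with a different length.
rows = [[0, 1, 2], [1, 2, 3, 4], [0, 1, 2, 3, 4]]
dt = np.dtype([('a', h5py.vlen_dtype(np.dtype('int32')))])

# Hypothetical scratch location, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'test-vla.h5')

with h5py.File(path, 'w') as h5:
    dset = h5.create_dataset('test', (5,), dtype=dt)
    for i, values in enumerate(rows):
        # One tuple per row; the variable-length field is a NumPy array.
        dset[i] = (np.array(values, dtype='int32'),)

with h5py.File(path, 'r') as h5:
    # Slicing returns an in-memory NumPy structured array (a copy).
    data = h5['test'][:len(rows)]
    read_back = [data['a'][i].tolist() for i in range(len(rows))]

print(read_back)
```

Note that the slice read from the dataset is a copy held in memory, which is why assigning into dset['a'][2] or dset[2]['a'] in the question never reaches the file.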
For more info, take a look at this post on the HDF Group Forum under HDF5 Ancillary Tools / h5py:
Compound datatype with int, float and array of floats