How do I read a dataset that is compressed with the lzf compression filter and rewrite it with a native HDF5 third-party filter like szip or zlib? Would simply reading it as shown in How to read HDF5 files in Python, then writing it with compression specified when creating the new dataset, work?
As @bnaecker said, you can copy the existing dataset and create a new one with a different compression filter. The new dataset can be in the same file or in a new one. Note: szip requires special licensing, so I created an example going from lzf to gzip. See the example below. The process is the same for any two compression filters; just change the compression= value.
import h5py
import numpy as np

filename = "SO_64582861.h5"

# Create random data
arr1 = np.random.uniform(-1, 1, size=(10, 3))

# Create initial HDF5 file with an lzf-compressed dataset
with h5py.File(filename, "w") as h5f:
    h5f.create_dataset("ds_lzf", data=arr1, compression="lzf")

# Re-open HDF5 file in 'append' mode
# Copy ds_lzf to ds_gzip with a different compression setting
# (could also copy to a second HDF5 file)
with h5py.File(filename, "a") as h5f:
    # List all root-level objects
    print("Keys: %s" % h5f.keys())
    arr2 = h5f["ds_lzf"][:]
    h5f.create_dataset("ds_gzip", data=arr2, compression="gzip")
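As noted above, the recompressed dataset can also go into a second HDF5 file instead of the same one. Here is a minimal sketch of that variant (the filenames and dataset names are hypothetical, chosen for illustration); it also checks the new dataset's compression attribute to confirm the filter was applied:

```python
import h5py
import numpy as np

src_name = "SO_64582861.h5"        # hypothetical source file
dst_name = "SO_64582861_gzip.h5"   # hypothetical destination file

arr = np.random.uniform(-1, 1, size=(10, 3))

# Write a source file containing an lzf-compressed dataset
with h5py.File(src_name, "w") as h5src:
    h5src.create_dataset("ds_lzf", data=arr, compression="lzf")

# Read the lzf dataset and write a gzip-compressed copy to a second file
with h5py.File(src_name, "r") as h5src, h5py.File(dst_name, "w") as h5dst:
    data = h5src["ds_lzf"][:]
    h5dst.create_dataset("ds_gzip", data=data, compression="gzip")

# Confirm the new dataset uses the gzip filter and holds the same values
with h5py.File(dst_name, "r") as h5dst:
    print(h5dst["ds_gzip"].compression)
```

The data round-trips exactly: compression filters in HDF5 are lossless, so the decompressed values are bit-for-bit identical regardless of which filter is used.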