Remove an externally linked HDF5 file using Python (h5py)


Deleting data from an HDF5 file requires repacking the master file to reclaim the space. Since we use large sub-database files (measurement data), the separate HDF5 data files are externally linked into the master file. As the links themselves take up little space, it is debatable whether a repack is necessary at all.

Since an HDF5 database can get corrupted, what is the correct procedure for removing the externally linked databases (h5py.ExternalLink) from the master HDF5 file?


Solution

  • An ExternalLink behaves like any other member of the file (groups and datasets), so you can remove it with del, referencing it by name. Deleting the link removes only the entry in the master file; the linked file itself is left untouched. For example, if you have an external link named '/my_linked_ds', you can delete it like this:

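    import h5py

    # Open the master file read-write; the context manager closes it on exit.
    with h5py.File('yourfile.h5', 'r+') as h5f:
        # Delete the link entry from the master file.
        del h5f['/my_linked_ds']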
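
If you want to be sure you are only removing external links (and not a real group or dataset by accident), you can check the link type first. The following is a minimal sketch, not a definitive recipe: the helper name remove_external_link and the file names sub.h5 and master.h5 are made up for illustration. It builds a throwaway master file with one external link, removes the link, and confirms that the linked file still holds its data.

```python
import os
import tempfile

import h5py


def remove_external_link(master_path, link_name):
    """Delete link_name from the master file, but only if it is an external link.

    Hypothetical helper for illustration. Returns True if a link was removed.
    """
    with h5py.File(master_path, "r+") as master:
        # getlink=True returns the link object itself rather than the
        # object the link points to.
        link = master.get(link_name, getlink=True)
        if not isinstance(link, h5py.ExternalLink):
            return False
        del master[link_name]
        return True


# Demo on temporary files (file names are illustrative assumptions).
with tempfile.TemporaryDirectory() as tmp:
    sub_path = os.path.join(tmp, "sub.h5")
    master_path = os.path.join(tmp, "master.h5")

    # The "measurement" file that will be linked into the master file.
    with h5py.File(sub_path, "w") as sub:
        sub.create_dataset("data", data=[1, 2, 3])

    # The master file containing only an external link to the dataset.
    with h5py.File(master_path, "w") as master:
        master["my_linked_ds"] = h5py.ExternalLink(sub_path, "/data")

    removed = remove_external_link(master_path, "/my_linked_ds")

    # The link is gone from the master file ...
    with h5py.File(master_path, "r") as master:
        link_gone = "my_linked_ds" not in master

    # ... while the linked file still contains its data.
    with h5py.File(sub_path, "r") as sub:
        sub_data = sub["data"][:].tolist()
```

Since the master file only ever stored the link, removing it frees very little space there, so whether a repack is worth doing afterwards depends on how many links you delete.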