
How to store file paths and read file paths from hdf5 with h5py?


I would like to save a very long file path as a string in my HDF5 file with h5py. I have the following code, but it does not seem to work: when I read the file back, the variable does not contain the file path. How can I do this better? Thank you.

import h5py

hdf5filename='testhdf5.h5'
hdf5dsetname_origin="/entry/origin/filename"

# create hdf5 file and store a very long file path

with h5py.File(hdf5filename, "w") as h5f:
    string_dt = h5py.special_dtype(vlen=str)
    h5f.create_dataset(hdf5dsetname_origin, data='/path/to/data/verylong/verylong/verlong/extralong', dtype=string_dt)

# read it and check the file path

with h5py.File(hdf5filename,"r") as h5:
    string=h5["/entry/origin/filename"]

print(string)


Solution

  • Creating a dataset to store a small scrap of data is overkill. Attributes (sometimes called metadata) are designed for this purpose. The example below stores your file path as an attribute, plus a float and an int attribute to show that other types work the same way. I added these attributes to the root-level group, but you can add attributes to any group or dataset. Complete details are in the h5py Attributes documentation.

    import h5py

    hdf5filename='test2hdf5.h5'
    hdf5dsetname_origin="entry/origin/filename"
    datastring="/path/to/data/verylong/verylong/verlong/extralong"

    # write the file path (plus a float and an int) as root-level attributes
    with h5py.File(hdf5filename, "w") as h5w:
        h5w.attrs[hdf5dsetname_origin] = datastring
        h5w.attrs['version'] = 1.01
        h5w.attrs['int_attr3'] = 3

    # read every attribute back while the file is still open
    with h5py.File(hdf5filename,"r") as h5r:
        for attr in h5r.attrs.keys():
            print(f'{attr} => {h5r.attrs[attr]}')
    

    Output looks like this:

    entry/origin/filename => /path/to/data/verylong/verylong/verlong/extralong
    int_attr3 => 3
    version => 1.01
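
If the path does need to live in a dataset, the code in the question is close to working. A likely cause of the unexpected output is that `h5["/entry/origin/filename"]` returns the dataset object rather than its contents, and that object is printed after the file has been closed. The sketch below is one way to read the value back, under two assumptions: the dataset is a scalar, so `[()]` returns its single value, and the string may come back as `bytes` (as in h5py 3.x) and so is decoded as UTF-8. The variable names `dset_name`, `path`, `raw` and `value` are only illustrative.

```python
import h5py

hdf5filename = "testhdf5.h5"
dset_name = "/entry/origin/filename"
path = "/path/to/data/verylong/verylong/verlong/extralong"

# Write the path as a scalar variable-length string dataset (same as the question).
with h5py.File(hdf5filename, "w") as h5f:
    string_dt = h5py.special_dtype(vlen=str)
    h5f.create_dataset(dset_name, data=path, dtype=string_dt)

# Read the value (not the dataset object) while the file is still open.
with h5py.File(hdf5filename, "r") as h5f:
    raw = h5f[dset_name][()]
    # Depending on the h5py version, raw may be bytes or str; normalise to str.
    value = raw.decode("utf-8") if isinstance(raw, bytes) else raw

print(value)
```

Because `value` is an ordinary Python string copied out of the file, it remains usable after the `with` block ends, unlike the dataset object itself.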