
How to combine multiple hdf5 files into one file and dataset?


import h5py
import numpy as np

with h5py.File("myCardiac.hdf5", "w") as f:
    dset = f.create_dataset("mydataset", (100,), dtype='i')
    grp = f.create_group("G:/Brain Data/Brain_calgary/")

I tried this code to create an HDF5 file. There are 50 HDF5 files in a folder. I want to combine all 50 of them into a single HDF5 file and dataset.


Solution

  • To merge 50 .h5 files, each containing a dataset named kspace with shape (24, 170, 218, 256), into one large dataset, use this code:

    import h5py
    import os
    
    with h5py.File("myCardiac.hdf5", "w") as f_dst:
        h5files = [f for f in os.listdir() if f.endswith(".h5")]
    
        dset = f_dst.create_dataset("mydataset", shape=(len(h5files), 24, 170, 218, 256), dtype='f4')
    
        for i, filename in enumerate(h5files):
            with h5py.File(filename) as f_src:
                dset[i] = f_src["kspace"]
    

    Detailed description

    First, create the destination file myCardiac.hdf5. Then get the list of all .h5 files in the directory:

    h5files = [f for f in os.listdir() if f.endswith(".h5")]
    

    NOTE: os.listdir() without arguments returns the list of files/folders in the current working directory. I assume this Python script sits in the same directory as the source files and that the CWD is set to that directory.
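
    If the script will not run from the data directory, a pathlib-based variant makes the location explicit and also sorts the names so the merge order is deterministic. The scans/ directory below is a hypothetical placeholder:

    from pathlib import Path
    
    # Hypothetical directory holding the 50 .h5 files; adjust as needed.
    data_dir = Path("scans")
    
    # Full paths, sorted so that slot i always maps to the same file.
    h5files = sorted(data_dir.glob("*.h5"))
    

    h5py.File accepts these Path objects directly, so the rest of the code is unchanged.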

    The next step is to create a dataset in the destination file with the desired size and data type:

    dset = f_dst.create_dataset("mydataset", shape=(len(h5files), 24, 170, 218, 256), dtype='f4')
    
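    Optionally, create_dataset also accepts chunks and compression keywords. A minimal sketch, if smaller files matter more than raw write speed:

    dset = f_dst.create_dataset(
        "mydataset",
        shape=(len(h5files), 24, 170, 218, 256),
        dtype='f4',
        chunks=True,          # let h5py choose a chunk shape
        compression="gzip",   # transparent per-chunk compression
    )
    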

    You can then copy the data from each source file into the target dataset:

    for i, filename in enumerate(h5files):
        with h5py.File(filename, "r") as f_src:
            # Reads the whole "kspace" array and writes it into slot i.
            dset[i] = f_src["kspace"]
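    

    As a quick sanity check, you can reopen the merged file and confirm the combined dataset's shape and dtype (assuming 50 source files):

    import h5py
    
    with h5py.File("myCardiac.hdf5", "r") as f:
        print(f["mydataset"].shape)   # expected: (50, 24, 170, 218, 256)
        print(f["mydataset"].dtype)   # expected: float32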