The problem I am trying to solve is the following: I have a long-running Python process (it can take many hours to finish) that produces up to 80,000 HDF5 files. Since one of the bottlenecks is the constant opening and closing of these files, I wrote proof-of-concept code that writes to a single HDF5 output file containing many tables. That certainly helps, but I wonder if there is a quick(er) way to export specified tables (with renaming, if possible) into a separate file?
Yes, there are at least 3 ways to copy the contents of a dataset from one HDF5 file to another. They include:
1. The `h5copy` command line utility from The HDF Group. You specify source and destination HDF5 files, along with source and destination objects. This likely does exactly what you want without a lot of coding; see the example below.
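For instance, a call along these lines extracts one dataset into a new file and renames it on the way out (the file and object names here are hypothetical placeholders):

```sh
h5copy -i all_tables.h5 -o table_001.h5 -s /table_001 -d /renamed_table
```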
2. h5py's `copy()` method for groups and/or datasets. You input source and destination objects; see the sketch after this item.
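A minimal sketch of the h5py approach, assuming a source file `all_tables.h5` holding a dataset `/table_001` (both names hypothetical):

```python
import h5py

with h5py.File("all_tables.h5", "r") as fsrc, \
     h5py.File("table_001.h5", "w") as fdst:
    # Group.copy(source, dest, name=...) copies the object into the
    # destination file; the name= argument renames the copy.
    fsrc.copy("/table_001", fdst, name="renamed_table")
```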
3. PyTables' `copy_node()` method. A node is a group and/or a dataset. You input source and destination objects; a sketch follows.
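A comparable sketch with PyTables (same hypothetical names); `File.copy_node()` accepts a `newparent` that lives in a different open file, and `newname` does the renaming:

```python
import tables

with tables.open_file("all_tables.h5", mode="r") as src, \
     tables.open_file("table_001.h5", mode="w") as dst:
    # copy_node() copies /table_001 under dst.root and renames it.
    src.copy_node("/table_001", newparent=dst.root, newname="renamed_table")
```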
If you choose to use h5py, there are a couple of relevant posts on SO: