I have a big HDF5 file with the images and its corresponding ground truth density map. I want to put them into the network CRSNet and it requires the images in separate files. How can I achieve that? Thank you very much.
-- Basic info I have a HDF5 file with two keys "images" and "density_maps". Their shapes are (300, 380, 676, 1). 300 stands for the number of images, 380 and 676 refer to the height and width respectively.
-- What I need to put into the CRSNet network are the images (jpg) with their corresponding HDF5 files. The shape of them would be (572, 945).
Thanks a lot for any comment and discussion!
For starters, a quick clarification on h5py and HDF5. h5py is a Python package to read HDF5 files. You can also read HDF5 files with the PyTables package (and with other languages: C, C++, FORTRAN).
I'm not entirely sure what you mean by "the images (jpg) with their corresponding h5py (HDF5) files" As I understand all of your data is in 1 HDF5 file. Also, I don't understand what you mean by: "The shape of them would be (572, 945)." This is different from the image data, right? Please update your post to clarify these items.
It's relatively easy to extract data from a dataset. This is how you can get the "images" as NumPy arrays and and use cv2 to write as individual jpg files. See code below:
with h5py.File('yourfile.h5','r') as h5f:
for i in range(h5f['images'].shape[0]):
img_arr = h5f['images'][i,:] # slice notation gets [i,:,:,:]
cv2.imwrite(f'test_img_{i:03}.jpg',img_arr)
Before you start coding, are you sure you need the images as individual image files, or individual image data (usually NumPy arrays)? I ask because the first step in most CNN processes is reading the images and converting them to arrays for downstream processing. You already have the arrays in the HDF5 file. All you may need to do is read each array and save to the appropriate data structure for CRSNet to process them. For example, here is the code to create a list of arrays (used by TensorFlow and Keras):
image_list = []
with h5py.File('yourfile.h5','r') as h5f:
for i in range(h5f['images'].shape[0]):
image_list.append( h5f['images'][i,:] ) # gets slice [i,:,:,:]