How to write/read jagged arrays in an HDF5 file using the C++ API?


I have multiple std::vector<float> objects of different sizes. I would like to write and read them all as a jagged array in an HDF5 file, ideally one by one using hyperslabs, since I can't hold all the vectors in memory simultaneously. I believe I should use a regular array whose elements are of a variable-length datatype, but all the examples I found use the C API. My code looks as follows:

#include <vector>
#include "H5Cpp.h"

int main() {
  std::vector<float> v1 {0.1, 0.2, 0.3};
  std::vector<float> v2 {0.4, 0.5};

  H5::VarLenType array_type (H5::PredType::NATIVE_FLOAT);

  hsize_t dimensions[1] = {2};
  H5::DataSpace dataspace (1, dimensions);

  H5::H5File file ("jarray.h5", H5F_ACC_TRUNC);
  H5::DataSet dataset = file.createDataSet("jarray", array_type, dataspace);

  hsize_t size[1] = {1};
  hsize_t offset[1] = {0};
  dataspace.selectHyperslab(H5S_SELECT_SET, size, offset);

  dataset.write(v1.data(), array_type);

  return 0;
}

If I leave out the call to the write function, the code creates a file containing an empty dataset with the following structure (as printed by h5dump):

HDF5 "jarray.h5" {
GROUP "/" {
   DATASET "jarray" {
      DATATYPE  H5T_VLEN { H5T_IEEE_F32LE}
      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
      DATA {
      (0): (), ()
      }
   }
}
}

This makes me believe the dataset has the correct structure but I'm not getting the writing part right.

Could someone clarify how to write to such an array? How would one go about reading the values back afterwards? Any help would be much appreciated.


Solution

  • Not sure how to do this with the HDF5 C++ API, but you could try HDFql, as it spares you the low-level details of HDF5. Using HDFql in C++, you could write and read an HDF5 jagged array as follows:

    // create HDF5 file 'jarray.h5' and use (i.e. open) it
    HDFql::execute("CREATE AND USE FILE jarray.h5");
    
    // create HDF5 dataset 'jarray' of one dimension (size 2) as a variable-length float (i.e. jagged)
    HDFql::execute("CREATE DATASET jarray AS VARFLOAT(2)");
    
    // write 0.1, 0.2 and 0.3 in first row of dataset 'jarray', 0.4 and 0.5 in second row
    HDFql::execute("INSERT INTO jarray VALUES((0.1, 0.2, 0.3), (0.4, 0.5))");
    
    // read first row of dataset 'jarray' using a hyperslab and populate cursor with values
    HDFql::execute("SELECT FROM jarray[0:::1]");
    
    // display values of first row
    while (HDFql::cursorNext() == HDFql::SUCCESS)
    {
         std::cout << *HDFql::cursorGetFloat() << std::endl;
    }
    
    // read second row of dataset 'jarray' using a hyperslab and populate cursor with values
    HDFql::execute("SELECT FROM jarray[1:::1]");
    
    // display values of second row
    while (HDFql::cursorNext() == HDFql::SUCCESS)
    {
         std::cout << *HDFql::cursorGetFloat() << std::endl;
    }
    

    This is a short example based on direct writing of values. If you need to write/read using user-defined memory (i.e. a variable), please take a look at the reference manual and examples to get additional information.
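
As a complement for readers who want to stay with the HDF5 C++ API: the write in the question most likely fails because a variable-length dataset expects each element in memory to be an hvl_t (a length plus a pointer), not the raw floats returned by v1.data(). In addition, the hyperslab selected on dataspace is never passed to write, so it has no effect there. The following is only a minimal sketch of the packing and unpacking step. It assumes a stand-in struct, VlenElement, with the same two fields as hvl_t so that it compiles without HDF5; VlenElement, describe_row and copy_row are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in with the same two fields as HDF5's hvl_t (a length plus a pointer).
// Assumption: with the real library you would include "H5Cpp.h" and use hvl_t.
struct VlenElement {
    std::size_t len;  // number of floats in this row
    void*       p;    // pointer to the first float of the row
};

// Describe one vector as a variable-length element. No data is copied, so the
// vector must stay alive and unmodified until the write has finished.
VlenElement describe_row(std::vector<float>& row) {
    return VlenElement{row.size(), row.data()};
}

// Copy one variable-length element (for example, one filled in by a read)
// back into a std::vector<float>.
std::vector<float> copy_row(const VlenElement& element) {
    assert(element.len == 0 || element.p != nullptr);
    const float* first = static_cast<const float*>(element.p);
    return std::vector<float>(first, first + element.len);
}
```

With the real library, one such element per vector would be passed to the DataSet::write overload that also takes a memory dataspace (a single element) and a file dataspace (with the hyperslab selecting the target row), so the vectors can be written one at a time. Reading works the same way in reverse: HDF5 fills in the len and p fields, the values are copied out as in copy_row, and the buffers HDF5 allocated should then be released (the C++ API offers DataSet::vlenReclaim for this).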