
Write Chunked data array to HDF5


I have this code to write an array to HDF5 with HDF5CSharp. The problem is that I need the data to be written in chunks of 1 x 100 x 500 instead of as one 100k x 100 x 500 block, and I cannot figure out how to do it.

using HDF5CSharp;

// array of 100k x 100 x 500
float[,,] traindata = new float[100000, 100, 500];

// ...
// fill the array with data
// ...

long fileId = Hdf5.CreateFile("d:\\Temp\\Test.H5");
var groupId = Hdf5.CreateOrOpenGroup(fileId, "data");

Hdf5.WriteDataset(groupId, "data", traindata);

Hdf5.CloseGroup(groupId);
Hdf5.CloseFile(fileId);
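
To make the sizes involved concrete, here is a minimal sketch of the arithmetic, assuming 32-bit floats as in the code above. The class and method names are hypothetical and used only for illustration. One 1 x 100 x 500 chunk is about 200 KB, while the full 100k x 100 x 500 array is about 20 GB, which is likely too large to allocate as a single managed array and is a further reason to write one sample at a time.

```csharp
using System;

static class ChunkMath
{
    // Hypothetical helper: bytes occupied by a block of 32-bit floats
    // with the given dimensions.
    public static long FloatBlockBytes(long d0, long d1, long d2) =>
        d0 * d1 * d2 * sizeof(float);

    static void Main()
    {
        long chunkBytes = FloatBlockBytes(1, 100, 500);
        long totalBytes = FloatBlockBytes(100000, 100, 500);

        Console.WriteLine($"one chunk:    {chunkBytes} bytes");      // 200000
        Console.WriteLine($"full dataset: {totalBytes} bytes");      // 20000000000
        Console.WriteLine($"chunk count:  {totalBytes / chunkBytes}"); // 100000
    }
}
```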

Solution

  • I am not sure how this is done with the library you mention, but with HDFql, a high-level language that abstracts away the low-level details of HDF5, your use case can be solved in C# as follows:

    // use HDFql namespace (make sure it can be found by the C# compiler)
    using AS.HDFql;
    
    // declare variables
    float[,,] traindata = new float[1, 100, 500];
    int number;
    int i;
    
    // create an HDF5 file named 'Test.h5' and use (i.e. open) it
    HDFql.Execute("create and use file Test.h5");
    
    // create a chunked dataset named 'data' (within a group named 'grp') of data type float with three dimensions (the first dimension is extendible)
    HDFql.Execute("create chunked(1, 100, 500) dataset grp/data as float(0 to unlimited, 100, 500)");
    
    // register variable 'traindata'
    number = HDFql.VariableRegister(traindata);
    
    // loop 100000 times
    for(i = 0; i < 100000; i++)
    {
       // populate variable 'traindata'
       // (...)
    
       // alter (i.e. extend) first dimension of 'data' plus one unit to make room for the next sample
       HDFql.Execute("alter dimension grp/data to +1");
    
       // insert (i.e. write) data stored in 'traindata' into the last position of 'data' (using a hyperslab selection)
       HDFql.Execute("insert into grp/data[-1:::] values from memory " + number);
    }
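
The "populate variable 'traindata'" step in the loop is left open. If the samples already sit in a larger in-memory array, one way to fill the reusable 1 x 100 x 500 buffer is sketched below. This is plain .NET and does not use HDFql; the `SlabCopy` class and `CopySample` helper are hypothetical names, and the small 4 x 2 x 3 array stands in for the real data.

```csharp
using System;

static class SlabCopy
{
    // Hypothetical helper: copy sample `index` (one 1 x rows x cols slab)
    // from `source` into the reusable buffer `slab`.
    public static void CopySample(float[,,] source, long index, float[,,] slab)
    {
        long sampleLength = (long)source.GetLength(1) * source.GetLength(2);
        if (slab.Length != sampleLength)
            throw new ArgumentException("slab must hold exactly one sample", nameof(slab));

        // Array.Copy treats multidimensional arrays as one flat row-major run,
        // so one sample is a contiguous range of `sampleLength` elements.
        Array.Copy(source, index * sampleLength, slab, 0L, sampleLength);
    }

    static void Main()
    {
        // Small stand-in for the real data: 4 samples of 2 x 3 values.
        var source = new float[4, 2, 3];
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 2; j++)
                for (int k = 0; k < 3; k++)
                    source[i, j, k] = i * 100 + j * 10 + k;

        var slab = new float[1, 2, 3];
        CopySample(source, 2, slab);
        Console.WriteLine(slab[0, 1, 2]); // prints 212
    }
}
```

Reusing a single slab buffer keeps memory use at one chunk regardless of how many samples are written.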