I'm trying to load a dataset from an HDF5 file in C# (.NET Framework) so that I end up with the contents in an array, e.g. float[,]. I found the HDF.PInvoke library, but I find it very difficult to figure out how to use it.
With the help of Soonts' answer, I managed to get it to work. Here's my working snippet:
using System;
using System.Runtime.InteropServices;
using HDF.PInvoke;

namespace MyNamespace
{
    class Program
    {
        static void Main()
        {
            string datasetPath = "/dense1/dense1/kernel:0";
            long fileId = H5F.open(@"\path\to\weights.h5", H5F.ACC_RDONLY);
            long dataSetId = H5D.open(fileId, datasetPath);
            long typeId = H5D.get_type(dataSetId);

            // read array (the rank can be queried with H5S.get_simple_extent_ndims,
            // the shape with H5S.get_simple_extent_dims)
            float[,] arr = new float[162, 128];
            // pin the array so the GC cannot move it while native code writes to it
            GCHandle gch = GCHandle.Alloc(arr, GCHandleType.Pinned);
            try
            {
                H5D.read(dataSetId, typeId, H5S.ALL, H5S.ALL, H5P.DEFAULT,
                         gch.AddrOfPinnedObject());
            }
            finally
            {
                gch.Free();
            }

            // release the HDF5 handles
            H5T.close(typeId);
            H5D.close(dataSetId);
            H5F.close(fileId);

            // show one entry
            Console.WriteLine(arr[13, 87].ToString());

            // Keep the console window open in debug mode.
            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }
}
What I've managed so far:
using System;
using System.IO;
using System.Runtime.InteropServices;
using HDF.PInvoke;

namespace MyNamespace
{
    class Program
    {
        static void Main()
        {
            string datasetPath = "/dense1/dense1/bias:0";
            long fileId = H5F.open(@"\path\to\weights.h5", H5F.ACC_RDONLY);
            long dataSetId = H5D.open(fileId, datasetPath);
            long typeId = H5D.get_type(dataSetId);
            long spaceId = H5D.get_space(dataSetId);

            // not sure about this
            TextWriter tw = Console.Out;
            GCHandle gch = GCHandle.Alloc(tw);

            // I was hoping that this would write to the Console, but the
            // program crashes outside the scope of the c# debugger.
            H5D.read(
                dataSetId,
                typeId,
                H5S.ALL,
                H5S.ALL,
                H5P.DEFAULT,
                GCHandle.ToIntPtr(gch)
            );

            // Keep the console window open in debug mode.
            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }
}
The signature for H5D.read() is:
Type    Name           Description
--------------------------------------------------------------
long    dset_id        Identifier of the dataset read from.
long    mem_type_id    Identifier of the memory datatype.
long    mem_space_id   Identifier of the memory dataspace.
long    file_space_id  Identifier of the dataset's dataspace in the file.
long    plist_id       Identifier of a transfer property list for this I/O operation.
IntPtr  buf            Buffer to receive data read from file.
Could anyone help me fill in the blanks here?
You need to create an array (a plain 1D one, not a 2D one) of the correct size and element type, pin it, and pass its address to H5D.read as the buffer. Something like this:
int width = 1920, height = 1080;
float[] data = new float[ width * height ];
var gch = GCHandle.Alloc( data, GCHandleType.Pinned );
try
{
    H5D.read( /* skipped */, gch.AddrOfPinnedObject() );
}
finally
{
    gch.Free();
}
This reads the dataset into the data array; you can then copy individual rows into a separate 2D array if you need one.
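The copy into a 2D array can be sketched as follows. This is a minimal illustration, not part of HDF.PInvoke: the class name ArrayReshape and the method To2D are hypothetical, and it assumes the flat buffer is in row-major order (which is how both HDF5 and C# rectangular arrays lay out their elements).

```csharp
using System;

static class ArrayReshape
{
    // Copies a flat row-major buffer into a new [rows, cols] array.
    // Hypothetical helper for illustration only.
    public static float[,] To2D(float[] flat, int rows, int cols)
    {
        if (flat.Length != rows * cols)
            throw new ArgumentException("Buffer length does not match rows * cols.");

        var result = new float[rows, cols];
        // Buffer.BlockCopy takes offsets and a count in bytes, not elements.
        Buffer.BlockCopy(flat, 0, result, 0, flat.Length * sizeof(float));
        return result;
    }
}
```

With the names from the snippet above, the call would be ArrayReshape.To2D(data, height, width), since each row holds width elements.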
Read the API documentation to find out how to get the number of dimensions (HDF5 supports datasets of arbitrary rank) and the extent of each dimension (for a 2D dataset, that is two integers). Those tell you the buffer size you need; for a 2D dataset, it's width * height elements.
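For reference, a sketch of how the shape might be queried with HDF.PInvoke. The class DatasetShape and the method GetDims are hypothetical names; the code assumes the 64-bit handle type (long) used in the question and an already opened dataset, and I have not run it against your file.

```csharp
using System;
using HDF.PInvoke;

static class DatasetShape
{
    // Returns the extent of each dimension of an already opened dataset.
    // Hypothetical helper; dataSetId is the handle returned by H5D.open.
    public static ulong[] GetDims(long dataSetId)
    {
        long spaceId = H5D.get_space(dataSetId);
        if (spaceId < 0)
            throw new InvalidOperationException("H5D.get_space failed.");
        try
        {
            // Number of dimensions (rank) of the dataspace.
            int rank = H5S.get_simple_extent_ndims(spaceId);
            if (rank < 0)
                throw new InvalidOperationException("H5S.get_simple_extent_ndims failed.");

            ulong[] dims = new ulong[rank];
            // The last argument would receive the maximum extents; null skips it.
            if (H5S.get_simple_extent_dims(spaceId, dims, null) < 0)
                throw new InvalidOperationException("H5S.get_simple_extent_dims failed.");
            return dims;
        }
        finally
        {
            H5S.close(spaceId);
        }
    }
}
```

For a 2D dataset, dims[0] is the number of rows and dims[1] the number of columns, so the buffer needs dims[0] * dims[1] elements.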
As for the element type, it is best to know that in advance; float, for example, is fine.
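If you would rather verify the element type than assume it, something along these lines may work. It is a sketch under the same assumptions as before (64-bit handles, an open dataset); DatasetTypeCheck and IsFloat32 are hypothetical names, and the check looks only at the type class and size, not at byte order.

```csharp
using System;
using HDF.PInvoke;

static class DatasetTypeCheck
{
    // True when the dataset stores 4-byte floating-point values.
    // Hypothetical helper; does not inspect endianness.
    public static bool IsFloat32(long dataSetId)
    {
        long typeId = H5D.get_type(dataSetId);
        if (typeId < 0)
            throw new InvalidOperationException("H5D.get_type failed.");
        try
        {
            H5T.class_t cls = H5T.get_class(typeId);
            long size = H5T.get_size(typeId).ToInt64(); // size in bytes
            return cls == H5T.class_t.FLOAT && size == sizeof(float);
        }
        finally
        {
            H5T.close(typeId);
        }
    }
}
```

Alternatively, passing H5T.NATIVE_FLOAT as the memory datatype in H5D.read should make HDF5 convert the stored values to float on the way into the buffer, as long as a conversion between the two types exists.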