python python-3.x large-files h5py

Read part of an h5 dataset in Python


I'm reading large datasets from an .h5 file (200,000 points per dataset). I currently don't need all of the data, so I've been reading in each whole dataset and truncating it afterwards.

Is there a way to only read the first X items of an h5 dataset?


Solution

  • Use this...

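    import h5py

    filename = 'file.hdf5'
    n = 1000  # number of items to read; pick whatever you need

    # Open read-only; the context manager closes the file afterwards.
    with h5py.File(filename, 'r') as f:
        key = list(f.keys())[0]  # first top-level name in the file
        data = f[key][:n]        # slicing reads only the first n entries


    Indexing may vary for key and f[key]: [0] picks an arbitrary (here, the first) dataset of file.hdf5, assuming that entry is a dataset and not a group, and [:n] takes the first n entries along the first axis. An integer index such as f[key][1] would instead return the single entry at index 1. Either way, h5py reads only the requested part from disk, and the result comes back as a NumPy array.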
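
To read only the first X items, as the question asks, one option is to slice the dataset object directly, since h5py datasets accept NumPy-style slices. Below is a minimal sketch, assuming a one-dimensional dataset. The file demo.h5, the dataset name points, and the helper read_first_items are hypothetical and exist only so the example runs on its own.

```python
import os
import tempfile

import h5py
import numpy as np


def read_first_items(path, dataset_name, n):
    """Return the first n entries of a dataset as a NumPy array."""
    with h5py.File(path, "r") as f:
        dset = f[dataset_name]  # Dataset handle; no data has been read yet
        return dset[:n]         # only this slice is read from disk


# Build a small demo file (hypothetical name and contents) for illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("points", data=np.arange(200_000))

first = read_first_items(demo_path, "points", 1000)
print(first.shape)
```

With a real file, replace demo_path and "points" with your own file path and dataset name. For a multi-dimensional dataset, a slice such as dset[:n, 0] should likewise read only the selected region.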