Tags: python, hdf5, iterable, h5py, stopiteration

Iterator class failing to raise StopIteration on HDF5 data opened with h5py


I'm trying to implement an iterable class that wraps an HDF5 dataset:

class Argh():
    def __init__(self, data):
        self.data = data
        self.c_idx = 0 

    def __getitem__(self, idx):
        return self.data[idx]

    def __len__(self):
        return len(self.data)

    def __next__(self):
        try:
            x = self.data[self.c_idx]
        except IndexError or ValueError:
            raise StopIteration
        self.c_idx += 1
        return x

    def __iter__(self):
        return self

When I iterate over an instance as a sequence, it fails to raise StopIteration; a ValueError propagates instead. This doesn't happen if I iterate over the h5 dataset directly, or if I iterate by index through __getitem__:

import h5py
import numpy as np

with h5py.File('test.h5', 'w') as f:
    f.create_dataset(name='matrix', data=np.arange(10), dtype=np.float32)

f = h5py.File('test.h5', 'r')
A = Argh(np.arange(0, 10))
B = Argh(f['matrix'])

for x in A: pass
for x in B.data: pass
for i in range(len(B)): B[i]
for x in f['matrix']: pass
for x in B: pass
ValueError  Traceback (most recent call last)
<ipython-input-7-1dcb814e7a79> in <module>
      3 for i in range(len(B)): B[i]
      4 for x in f['matrix']: pass
----> 5 for x in B: pass
ValueError: Index (10) out of range (0-9)

I've tested this on several other objects, including zarr arrays, but observed this behavior only with HDF5 datasets opened through h5py.


Solution

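  • IndexError or ValueError is an ordinary boolean expression that evaluates to IndexError, so the except clause only catches IndexError. The ValueError that h5py raises here (see the traceback) is never caught and propagates out of __next__. To match multiple exception types, pass them as a tuple: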

    except (IndexError, ValueError):
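
A minimal sketch of the corrected class follows. To keep it runnable without an HDF5 file, it assumes a stand-in FakeDataset (a hypothetical class, not part of h5py) that raises ValueError on an out-of-range index, as the dataset in the traceback above does:

```python
# Hypothetical stand-in for the h5py dataset in the question: like the
# traceback shows, it raises ValueError (not IndexError) on a bad index.
class FakeDataset:
    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self._values):
            raise ValueError(
                f"Index ({idx}) out of range (0-{len(self._values) - 1})"
            )
        return self._values[idx]


class Argh:
    def __init__(self, data):
        self.data = data
        self.c_idx = 0

    def __getitem__(self, idx):
        return self.data[idx]

    def __len__(self):
        return len(self.data)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            x = self.data[self.c_idx]
        except (IndexError, ValueError):  # a tuple matches either type
            raise StopIteration
        self.c_idx += 1
        return x


# `or` returns its first truthy operand, and a class object is truthy,
# so the original clause was equivalent to `except IndexError:`.
assert (IndexError or ValueError) is IndexError

# Iteration now ends cleanly instead of leaking the ValueError.
collected = list(Argh(FakeDataset(range(10))))
```

With a real file, Argh(f['matrix']) should terminate the same way, since either exception type is now translated into StopIteration. Note that c_idx is never reset, so each instance can only be iterated once.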