I know the title is very general, but I don't know a better way to describe my question.
I'm using scipy's io.loadmat
to load a Matlab mat file. This mat file originally had some structs in it, which I suppose were converted to numpy arrays. The structure of the mat file is as follows: there are 500 structs, each with 3 fields.
print(data[0].shape)
(500, )
The first and second fields have elements of shape (300, 300)
print(data[0][0].shape)
(300, 300)
print(data[499][0].shape)
(300, 300)
print(data[0][1].shape)
(300, 300)
print(data[499][1].shape)
(300, 300)
The third field is a scalar
print(data[0][2].shape)
(1, 1)
print(data[499][2].shape)
(1, 1)
I want to split up this file so that I have variables of shape (500, 300, 300), (500, 300, 300), and (500,).
I've tried
field1 = data[:][0]
but it gives the wrong elements: field1[0] = data[0][0]
, field1[1] = data[0][1]
, field1[2] = data[0][2]
, and field1[3]
raises an invalid-index error. What I want is field1[0] = data[0][0]
... field1[499] = data[499][0]
How do I index across the dimension of size 500?
I know I can do
field1 = np.array([data[i][0] for i in range(500)])
but I'm wondering if there's something simpler.
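(For reference, if the loaded data really is a numpy structured array, the struct field names can be listed via dtype.names. A minimal sketch with a synthetic stand-in; the field names 'a', 'b', 'c' are placeholders, not the real ones:)

```python
import numpy as np

# Stand-in for the loadmat result, assuming it is a structured array.
# Field names 'a', 'b', 'c' are placeholders for the Matlab struct fields.
dt = np.dtype([('a', float, (2, 2)),
               ('b', float, (2, 2)),
               ('c', float, (1, 1))])
data = np.zeros((5,), dtype=dt)

print(data.dtype.names)  # the field names, e.g. ('a', 'b', 'c')
```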
Sounds like you have a structured array with 3 fields. Something along these lines:
A dtype with two fields:
In [38]: dt = np.dtype([('f0',int,(2,2)),('f1','U3',(1,1))])
An array of four records/items:
In [39]: data = np.zeros((4,), dtype=dt)
In [40]: data
Out[40]:
array([([[0, 0], [0, 0]], [['']]), ([[0, 0], [0, 0]], [['']]),
([[0, 0], [0, 0]], [['']]), ([[0, 0], [0, 0]], [['']])],
dtype=[('f0', '<i8', (2, 2)), ('f1', '<U3', (1, 1))])
In [41]: data.shape
Out[41]: (4,)
one record:
In [42]: data[0]
Out[42]: ([[0, 0], [0, 0]], [['']])
Within one record, a field may be selected by number, because the record is a tuple (or tuple-like):
In [43]: data[0][0]
Out[43]:
array([[0, 0],
[0, 0]])
but to select by field for all records, use the name:
In [45]: data['f0']
Out[45]:
array([[[0, 0],
[0, 0]],
[[0, 0],
[0, 0]],
[[0, 0],
[0, 0]],
[[0, 0],
[0, 0]]])
In [46]: data['f0'].shape
Out[46]: (4, 2, 2)