Tags: python, arrays, numpy, matlab, mat-file

Numpy array with elements of different last axis dimensions


Assume the following code:

import numpy as np

x = np.random.random([2, 4, 50])
y = np.random.random([2, 4, 60])

z = [x, y]
z = np.array(z, dtype=object)

This gives a ValueError: could not broadcast input array from shape (2,4,50) into shape (2,4)

I can understand why this error occurs: the trailing (last) dimension of the two arrays differs, so NumPy cannot stack them into a single regular array.

However, I have a MAT-file which, when loaded in Python with scipy's io.loadmat() function, contains an np.ndarray with the following properties:

from scipy import io

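# Note: io.loadmat() returns a dict keyed by variable name; `mat` in the
# prints below stands for the ndarray stored under the relevant key.
mat = io.loadmat(file_name='gt.mat')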

print(mat.shape)
> (1, 250)

print(mat[0].shape, mat[0].dtype)
> (250,) dtype('O')

print(mat[0][0].shape, mat[0][0].dtype)
> (2, 4, 54), dtype('<f8')

print(mat[0][1].shape, mat[0][1].dtype)
> (2, 4, 60), dtype('<f8')

This is pretty confusing to me. How can the array mat[0] in this file hold NumPy arrays with different trailing dimensions as objects, while being an np.ndarray itself, when I am not able to build such an array myself?


Solution

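  • When you call np.array on a list of arrays, NumPy tries to stack them into one multi-dimensional array, even with dtype=object. Here x and y agree on their leading dimensions (2, 4), so NumPy appears to aim for a (2, 2, 4) result and then fails on the mismatched last axis, which is what the error message reports. Note that you are dealing with object arrays in both cases, yours and the one from the MAT-file, so what you want is still possible. One way is to first create an empty array of objects and then fill in the values.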

    z = np.empty(2, dtype=object)
    z[0] = x
    z[1] = y
    

    Like in this answer.
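
The same idea can be wrapped in a small helper. This is only a minimal sketch: the name to_object_array is made up for illustration, and the final reshape merely mimics the (1, N) layout seen in the question, using two elements instead of 250.

```python
import numpy as np


def to_object_array(arrays):
    """Pack arrays of arbitrary shapes into a 1-D object array.

    Hypothetical helper, not part of NumPy.
    """
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        # Integer indexing stores a reference to the array itself,
        # so NumPy never attempts to stack or broadcast the elements.
        out[i] = a
    return out


x = np.random.random([2, 4, 50])
y = np.random.random([2, 4, 60])

z = to_object_array([x, y])
print(z.shape, z.dtype)        # (2,) object
print(z[0].shape, z[1].shape)  # (2, 4, 50) (2, 4, 60)

# Reshape to a single row, similar to the (1, 250) array in the question.
z_row = z.reshape(1, -1)
print(z_row.shape)             # (1, 2)
```

Because each slot holds only a reference, z[1] is the same object as y, and the element shapes are free to differ.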