Why do I get a value error while creating a ragged array but only for a certain shape?


import numpy as np

for h in range(10):
    try:
        array = np.array([np.zeros((h, 4)), np.zeros((3, h))], dtype=object)
    except ValueError:
        print(f'Value Error for h={h} only.')

In the above code, a ValueError is raised only for h=3. This seems arbitrary.

The full error is:

  File "path/to/arr.py", line 4, in <module>
    array = np.array([np.zeros((h, 4)), np.zeros((3, h))], dtype=object)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not broadcast input array from shape (3,4) into shape (3,)

How may I avoid this and why does this happen?


Solution

    • When you pass a list of arrays to np.array, NumPy tries to combine them into one regular, multi-dimensional array if it can, even with dtype=object.
    • At h=3 both arrays happen to have the same number of rows:
      • first array: (3, 4)
      • second array: (3, 3)
    • Because the leading dimension matches, NumPy assumes you want to stack along it and, as the error message indicates, tries to fit each input into a slot of shape (3,). The remaining dimensions (4 and 3 columns) don't match, so a (3, 4) array cannot be broadcast into shape (3,), resulting in the ValueError.
    • For other values of h, the numbers of rows differ (h vs. 3), so NumPy doesn't attempt to stack the arrays and simply stores them as two separate objects.
    • To avoid this, explicitly store each array as a separate object: create an empty object array and assign the arrays to it one by one, so NumPy never tries to combine them.
    import numpy as np
    
    for h in range(10):
        # Allocate two object slots first, then fill them one at a time
        array = np.empty(2, dtype=object)
        array[0] = np.zeros((h, 4))
        array[1] = np.zeros((3, h))
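
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name as_object_array is made up for illustration and is not a NumPy function. It uses the same np.empty plus per-slot assignment as above, and the loop checks that every h, including h=3, gives an outer shape of (2,) with the inner arrays left untouched.

```python
import numpy as np

def as_object_array(items):
    """Hypothetical helper: store each item in its own slot of a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        # Assigning by integer index stores the array itself as one object,
        # so NumPy never tries to merge the inputs into a regular array.
        out[i] = item
    return out

for h in range(10):
    array = as_object_array([np.zeros((h, 4)), np.zeros((3, h))])
    assert array.shape == (2,)          # always two slots, even for h=3
    assert array[0].shape == (h, 4)     # inner arrays keep their own shapes
    assert array[1].shape == (3, h)
```

The cost is that the result is always a 1-D array of objects, so vectorised operations across the inner arrays are not available; you loop over the slots instead.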