python numpy numpy-ndarray array-broadcasting

Broadcast arrays to the same shape, then get the minimum along one axis


I have three numpy objects with different shapes and numbers of dimensions (one of them is not even an array but a scalar).

arr1.shape
# (3, 3, 11)
arr2  # this is the scalar
# 1
arr3.shape
# (3, 11)

I want to compute the minimum along a specific axis.

I tried like this:

np.minimum.accumulate([arr1, np.expand_dims(arr2, axis=(0,1,2)), np.expand_dims(arr3, axis=1)])

Result is:

VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I thought that would work because the shapes satisfy the broadcasting rules:

Two dimensions are compatible when
1. they are equal, or
2. one of them is 1.

Adding, subtracting, or any other mathematical operation works, so why not the minimum function as well?

# adding works as a proof that broadcasting works
arr1 + np.expand_dims(arr2, axis=(0,1,2)) + np.expand_dims(arr3, axis=1)
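The shape check can also be made explicit. The following is a minimal sketch, assuming placeholder data (ones and zeros) in the shapes listed above and NumPy 1.20 or later for `np.broadcast_shapes`; the names `a2`, `a3`, `common` and `total` are only illustrative.

```python
import numpy as np

# Placeholder inputs with the shapes from the question (values are assumptions).
arr1 = np.ones((3, 3, 11))
arr2 = 1                      # the scalar
arr3 = np.zeros((3, 11))

# Reshape the scalar and the 2-D array so every operand is 3-D.
a2 = np.expand_dims(arr2, axis=(0, 1, 2))   # shape (1, 1, 1)
a3 = np.expand_dims(arr3, axis=1)           # shape (3, 1, 11)

# Ask NumPy which common shape the three operands broadcast to.
common = np.broadcast_shapes(arr1.shape, a2.shape, a3.shape)
print(common)

# Elementwise arithmetic broadcasts pairwise to that same shape.
total = arr1 + a2 + a3
print(total.shape)
```

Note that a plain scalar already broadcasts against any array, so the `expand_dims` call on `arr2` is probably not required for the addition; it is kept here only to mirror the expression above.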

The following gives me what I want, but it does not feel like the idiomatic numpy approach. Why do I have to broadcast the arrays to the same shape "manually" when the usual arithmetic operations such as addition and subtraction do not need it? Am I missing something, or is this already the cleanest way to do it?

np.array(np.broadcast_arrays(arr1, np.expand_dims(arr2, axis=(0,1,2)), np.expand_dims(arr3, axis=1))).min(axis=0)

I want to add that I know about np.minimum(), but it only compares two arrays. I want to pass multiple arrays.


Solution

  • np.minimum.accumulate (like the accumulate method of any ufunc) first makes a single array from the input list. That conversion is what produces the ragged array warning.

    With object dtype specified, as the warning asks, the result is a 3-element array whose elements are arrays of different shapes:

    In [92]: np.array([arr1, np.expand_dims(arr2, axis=(0,1,2)), np.expand_dims(arr3, axis=1)],object)
    Out[92]: 
    array([array([[[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
                   [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
                   [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]],
    
                  [[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
                   [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
                   [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]],
    
                  [[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
                   [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
                   [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]]]),
           array([[[1]]]),
           array([[[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
    
                  [[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
    
                  [[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]])],
          dtype=object)
    

    np.add can work on this 3-element array:

    In [104]: np.add.reduce(_92).shape
    Out[104]: (3, 3, 11)
    In [105]: np.add.accumulate(_92).shape
    Out[105]: (3,)
    

    np.minimum, with either reduce or accumulate, gives this ambiguity error:

    In [106]: np.minimum.reduce(_92).shape
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    Cell In[106], line 1
    ----> 1 np.minimum.reduce(_92).shape
    
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    

    That means it's comparing two arrays, getting a boolean array, and then trying to use that boolean array as a single truth value.

    np.minimum(_92[0],_92[1]) works, but np.minimum.reduce(_92[[0,1]]) gives this error. It's almost as though it's doing some sort of if or or/and on the pairs, but that's hidden in compiled code.

    With broadcast_arrays you get three arrays of the same shape, so np.array can stack them into a 4d array, which allows the normal min on an axis:

    In [117]: np.array(np.broadcast_arrays(arr1, np.expand_dims(arr2, axis=(0,1,2)), np.expand_dims(arr3, axis=1))).shape
    Out[117]: (3, 3, 3, 11)
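If the goal is just the elementwise minimum over several broadcastable inputs, two shorter spellings may be worth considering. This is a sketch under assumptions, not the only way to do it: the data is random placeholder data in the question's shapes, and the names `pairwise`, `stacked` and `manual` are illustrative. The first variant applies the two-argument `np.minimum` pairwise with `functools.reduce`, so each call broadcasts as usual; the second hands the output of `np.broadcast_arrays` directly to `np.minimum.reduce`, which reduces over the leading axis by default.

```python
from functools import reduce

import numpy as np

# Placeholder inputs with the shapes from the question (values are assumptions).
rng = np.random.default_rng(0)
arr1 = rng.normal(size=(3, 3, 11))
arr2 = 1                           # the scalar
arr3 = rng.normal(size=(3, 11))

# Same alignment as np.expand_dims(arr3, axis=1): shape (3, 1, 11).
arr3_aligned = arr3[:, None, :]

# Variant 1: apply the binary ufunc pairwise; every call broadcasts on its own.
pairwise = reduce(np.minimum, [arr1, arr2, arr3_aligned])

# Variant 2: broadcast everything to one shape, then reduce over the new leading axis.
stacked = np.minimum.reduce(np.broadcast_arrays(arr1, arr2, arr3_aligned))

# The approach from the question, for comparison.
manual = np.array(
    np.broadcast_arrays(
        arr1,
        np.expand_dims(arr2, axis=(0, 1, 2)),
        np.expand_dims(arr3, axis=1),
    )
).min(axis=0)

print(pairwise.shape)
print(np.array_equal(pairwise, stacked), np.array_equal(pairwise, manual))
```

The pairwise variant never builds the stacked 4d intermediate, which may matter for large inputs; the stacked variant is closer to the original approach and makes the reduction axis explicit.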