python arrays numpy broadcast dimensions

Altering arrays of different dimensions to be broadcasted together


I am looking for a more efficient way to convert an (n,n) or (n,n,1) array to an (n,n,3) array. I start out with an (n,n,3) array, but it is reduced to (n,n) after I sum over the last axis. Essentially, I want to keep the original shape and have the summed values repeated 3 times along the last axis. I need this because I will later broadcast the result with another (n,n,3) array, and they need the same dimensions.

My current method works, but does not seem elegant.

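    import numpy as np
    n = 4  # example size
    a = np.random.random((n, n))
    b = a.flatten().tolist()
    # list() is needed on Python 3, where zip returns an iterator
    a = np.array(list(zip(b, b, b)))
    a.shape = n, n, 3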

This setup has the desired result, but it is clunky and hard to follow. Is there a way to go directly from (n,n) to (n,n,3) by duplicating the values along a new last axis? Or perhaps a way to avoid reducing the array in the first place?


Solution

  • Indexing with None (or its alias np.newaxis) is a common way of adding a dimension to an array. reshape to (3,3,1) works just as well:

    In [64]: arr=np.arange(9).reshape(3,3)
    In [65]: arr1 = arr[...,None]
    In [66]: arr1.shape
    Out[66]: (3, 3, 1)
    

    repeat, as a function or a method, replicates the values along that new axis.

    In [72]: arr2=arr1.repeat(3,axis=2)
    In [73]: arr2.shape
    Out[73]: (3, 3, 3)
    In [74]: arr2[0,0,:]
    Out[74]: array([0, 0, 0])
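
As a rough comparison, here is a minimal sketch of three ways to build the same (3,3,3) copy from a (3,3) array. np.tile and np.stack are not part of the session above; they are shown only as alternatives that should give the same values as repeat.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

# Add a trailing axis, then repeat along it (the approach from the session).
via_repeat = np.repeat(arr[..., None], 3, axis=2)

# np.tile with reps (1, 1, 3) copies the length-1 trailing axis three times.
via_tile = np.tile(arr[..., None], (1, 1, 3))

# np.stack places three copies of arr along a new last axis.
via_stack = np.stack([arr, arr, arr], axis=-1)

print(via_repeat.shape, via_tile.shape, via_stack.shape)
```

All three allocate a new array, so they are mainly useful when a writable, independent (n,n,3) copy is really needed.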
    

    But you might not need to do this. With broadcasting, a (3,3,1) array already combines with a (3,3,3) array.

    In [75]: (arr1+arr2).shape
    Out[75]: (3, 3, 3)
    

    In fact, it will broadcast with a (3,) array to produce (3,3,3).

    In [77]: arr1+np.ones(3,int)
    Out[77]: 
    array([[[1, 1, 1],
            [2, 2, 2],
            ...
           [[7, 7, 7],
            [8, 8, 8],
            [9, 9, 9]]])
    

    So arr1+np.zeros(3,int) is another way of expanding that (3,3,1) to (3,3,3).
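
If the expanded array is only going to be read, np.broadcast_to may be worth considering as well. The sketch below assumes the same arr1 as in the session; it is an alternative to the addition trick, not something the session itself uses.

```python
import numpy as np

arr1 = np.arange(9).reshape(3, 3)[..., None]   # shape (3, 3, 1)

# A read-only view with shape (3, 3, 3); the underlying data is not copied.
view = np.broadcast_to(arr1, (3, 3, 3))

# The arithmetic route allocates a new, writable array instead.
copy = arr1 + np.zeros(3, int)

print(view.shape, view.flags.writeable)
print(np.shares_memory(view, arr1), np.array_equal(view, copy))
```

The view cannot be assigned to; calling .copy() on it gives an ordinary writable array when one is required.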

    The broadcasting rules are:

    (3,3,1) + (3,) => (3,3,1) + (1,1,3) => (3,3,3)
    

    Broadcasting adds size-1 dimensions at the start as needed, then stretches each size-1 dimension to match the other array.
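
The rule can be checked without building any arrays. This sketch uses np.broadcast_shapes, which assumes NumPy 1.20 or newer; the size 4 is an arbitrary stand-in for n.

```python
import numpy as np

# Shapes are aligned from the right; missing leading axes count as size 1.
print(np.broadcast_shapes((3, 3, 1), (3,)))
print(np.broadcast_shapes((4, 4, 1), (4, 4, 3)))

# A plain (n, n) shape does not line up with (n, n, 3) when n != 3:
# aligned from the right, 4 is compared with 3 and neither is 1.
try:
    np.broadcast_shapes((4, 4), (4, 4, 3))
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```

This is why the trailing length-1 axis matters: (n,n,1) lines up with (n,n,3), while (n,n) generally does not.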

    When you sum on an axis, you can keep the original number of dimensions with a parameter:

    In [78]: arr2.sum(axis=2).shape
    Out[78]: (3, 3)
    In [79]: arr2.sum(axis=2, keepdims=True).shape
    Out[79]: (3, 3, 1)
    

    This is handy if you want to subtract the mean from an array along any dimension:

    arr2-arr2.mean(axis=2, keepdims=True)
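
Putting this together for the case in the question, a minimal sketch might look like the following. The size n, the random input, and the division are assumptions made for illustration; the question does not say what operation follows the sum.

```python
import numpy as np

n = 4  # hypothetical size, chosen only for illustration
rng = np.random.default_rng(0)
x = rng.random((n, n, 3))              # the original (n, n, 3) array

# keepdims=True leaves a length-1 last axis, so the result is (n, n, 1).
total = x.sum(axis=2, keepdims=True)

# (n, n, 3) / (n, n, 1) broadcasts to (n, n, 3) with no manual expansion.
normalized = x / total

print(total.shape, normalized.shape)
```

With keepdims=True the sum never drops to (n,n), so no zip, repeat, or reshape step is needed before combining it with the other (n,n,3) array.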