Tags: python, numpy, multidimensional-array, array-broadcasting

NumPy ndarray broadcasting - shape (X,) vs (X, 1) to operate with (X,Y)


I have a NumPy ndarray shaped (32, 1024) which holds 32 signal measurements that I would like to combine into a single 1024-element array, with a different weight for each of the 32. I was using numpy.average, but my weights are complex, and average normalises the weights by their sum, which throws off my results.
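
To illustrate with toy numbers (a minimal sketch, not my real data), the normalisation looks like this:

import numpy as np

# 3 "measurements" of a 4-sample signal, with complex weights
data = np.arange(12.0).reshape(3, 4)
weights = np.array([1 + 1j, 2 - 1j, 0.5j])

# average divides the weighted sum by weights.sum(); that division
# is the normalisation that skews my results
print(np.average(data, axis=0, weights=weights))
print((weights.reshape((3, 1)) * data).sum(axis=0) / weights.sum())  # same values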

Looking at the code for average, I realised that I can accomplish the same thing by multiplying the weights by the signal array and then summing over the first axis. However, when I try to multiply my (32,) weights array by the (32, 1024) signal array, I get a dimension mismatch, as (32,) cannot be broadcast to (32, 1024). If I reshape the weights array to (32, 1) then everything works as expected, but this results in rather ugly code:

avg = (weights.reshape((32, 1)) * data).sum(axis=0)

Can anybody explain why NumPy will not allow my (32,) array to broadcast to (32, 1024) and/or suggest an alternative, neater way of performing the weighted average?


Solution

  • Generic setup for alignment between (X,) and (X,Y) shaped arrays

    On the question of why (32,) can't broadcast to (32, 1024): broadcasting aligns shapes from the trailing (rightmost) axes, so the lone axis of weights lines up against the last axis of data, 1024, and the two don't match (the sketch after the schematic shows the resulting error). To put it into a schematic, we have :

    weights :         32
    data    :  32 x 1024 
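
    Concretely (a minimal sketch using the question's shapes and dummy values), the mismatch surfaces as a broadcasting error:

    import numpy as np

    weights = np.ones(32, dtype=complex)
    data = np.ones((32, 1024))

    try:
        weights * data  # the 32 of weights lines up against 1024, not 32
    except ValueError as err:
        print(err)  # operands could not be broadcast together with shapes (32,) (32,1024)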
    

    We need to align the only axis of weights with the first axis of data. So, as you discovered, one way is to reshape weights to 2D, such that we end up with a singleton dimension as the second axis. This could be achieved by introducing a new axis with None/np.newaxis : weights[:,np.newaxis] or weights[:,None], or with a simple reshape : weights.reshape(-1,1). Hence, going back to the schematic, with the modified version we would have :

    weights[:,None] :  32 x    1
    data            :  32 x 1024
    

    Now that the shapes are aligned, we can perform any generic element-wise operation between these two, with the result schematic looking like so -

    weights[:,None] :  32 x    1
    data            :  32 x 1024
    result          :  32 x 1024
    

    This broadcasts weights along the second axis, and the relevant element-wise operation with data produces result.
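
    As a quick sketch with small stand-in shapes (4 weights, 5 samples per signal):

    import numpy as np

    weights = np.arange(1.0, 5.0)    # shape (4,)
    data = np.ones((4, 5))           # shape (4, 5)

    prod = weights[:, None] * data   # (4, 1) broadcasts against (4, 5)
    print(prod.shape)                # (4, 5)
    print(prod[:, 0])                # each row scaled by its weight: [1. 2. 3. 4.]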

  • Solving our specific case and alternatives

    Following the discussion in the previous section, to solve our case of element-wise multiplication it would be weights[:,None]*data, and then a sum along axis=0, i.e. -

    (weights[:,None]*data).sum(axis=0)
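
    As a sanity check on made-up data with the question's shapes, this matches the reshape version from the question:

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(size=(32, 1024))
    weights = rng.normal(size=32) + 1j * rng.normal(size=32)

    a = (weights[:, None] * data).sum(axis=0)
    b = (weights.reshape((32, 1)) * data).sum(axis=0)
    print(np.allclose(a, b))  # True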
    

    Let's look for neat alternatives!

    One neat and probably intuitive way would be with np.einsum -

    np.einsum('i,ij->j',weights,data)
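
    Here 'i,ij->j' reads as: multiply weights[i] by data[i, j], sum over the repeated index i, and leave only axis j. A self-contained check on tiny inputs:

    import numpy as np

    weights = np.array([1 + 2j, 3 - 1j])
    data = np.arange(6.0).reshape(2, 3)

    out = np.einsum('i,ij->j', weights, data)
    print(np.allclose(out, (weights[:, None] * data).sum(axis=0)))  # True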
    

    Another way would be with matrix-multiplication using np.dot, where the sum-reduction happens between the only axis of weights and the first axis of data, like so -

    weights.dot(data)
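
    On Python 3.5+ the @ (matmul) operator spells the same contraction, since a 1-D left operand is contracted against the first axis of a 2-D right operand. A quick sketch confirming that all the variants agree:

    import numpy as np

    rng = np.random.default_rng(0)
    weights = rng.normal(size=32) + 1j * rng.normal(size=32)
    data = rng.normal(size=(32, 1024))

    a = (weights[:, None] * data).sum(axis=0)
    b = np.einsum('i,ij->j', weights, data)
    c = weights.dot(data)
    d = weights @ data

    print(np.allclose(a, b) and np.allclose(a, c) and np.allclose(a, d))  # True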