python numpy for-loop multidimensional-array vectorization

Vectorizing loop of row-plus-matrix operations

I have matrices A (r x k) and B (m x k), and I want to add each row A[:,i] of the first to B in its entirety and return a separate matrix for each row of the first (equivalently, a 3d matrix of dimension r x m x k). I.e.:

import numpy as np

v = np.array([[1,2],[3,4]])
w = np.zeros((5,2))

def mysum(a,b):
    return np.array([a[i]+b for i in range(a.shape[0])])

mysum(v,w)

This returns the desired output, albeit slowly:

[array([[1., 2.],
        [1., 2.],
        [1., 2.],
        [1., 2.],
        [1., 2.]]),
 array([[3., 4.],
        [3., 4.],
        [3., 4.],
        [3., 4.],
        [3., 4.]])]

I tried using the axes argument in np.add, but it didn't accept the keyword.

Is there a vectorized way to do this? I cannot figure out how to do it without a slow for-loop. I read elsewhere on Overflow that apply_along_axis is merely a for-loop in disguise, and my timing tests show that the following only runs in slightly less than half the time compared to the above list-comprehension solution, and my experience with vectorization leads me to believe much faster speeds are possible:

def notsofastsum(t,x):
    return np.apply_along_axis(np.add,1,t,x)

And I don't think using vectorize will work, since I found this in the docs:

The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.

Solution

Your arrays are:

In [193]: v.shape, w.shape
Out[193]: ((2, 2), (5, 2))

and the result shape: (2, 5, 2)

We can use broadcasting to do this:

In [195]: v[:,None,:]+w           # (2,1,2) + (5,2)
Out[195]: 
array([[[1., 2.],
        [1., 2.],
        [1., 2.],
        [1., 2.],
        [1., 2.]],

       [[3., 4.],
        [3., 4.],
        [3., 4.],
        [3., 4.],
        [3., 4.]]])

Your loop also use broadcasting.

In [196]: v[0].shape
Out[196]: (2,)

It's adding a (2,) shape to a (5,2) - which by broadcasting is (1,2) added to (5,2) => (5,2). And you do this on a loop on the first dimension of v.