Confusion regarding np.tile and numpy broadcasting

Suppose I have a 2D numpy array A of shape (m, n). I would like to create a 3D array B of shape (m, n, k) such that B[:, :, l] is a copy of A for any slice l. I could think of two ways to do this:

np.tile(A, (m, n, k))

np.repeat(A[:, :, np.newaxis], k, axis=-1)

The first way seems simpler, but the docs for np.tile mention:

Note: Although tile may be used for broadcasting, it is strongly
recommended to use numpy's broadcasting operations and functions.

Why is this so, and is this an issue with np.repeat as well?

My other worry is that if m == n == k, np.tile() might create confusion regarding which axis is augmented.

In summary, I have two questions:

1. Why is np.tile not preferred, and would m == n == k cause unexpected behavior in some cases?
2. Which of the two ways above is more efficient in terms of time and memory? Is there a cleaner or more efficient way than both approaches?

Solution

In [100]: A = np.arange(12).reshape(3,4)

Using repeat to add a new dimension at the end:

In [101]: B = np.repeat(A[:,:,np.newaxis], 2, axis=-1)
In [102]: B.shape
Out[102]: (3, 4, 2)

Using tile and repeat to add a new dimension at the beginning:

In [104]: np.tile(A, (2,1,1)).shape
Out[104]: (2, 3, 4)
In [105]: np.repeat(A[None,:,:], 2, axis=0).shape
Out[105]: (2, 3, 4)

If we specify 2 repeats on the last dimension with tile, it gives a different shape:

In [106]: np.tile(A, (1,1,2)).shape
Out[106]: (1, 3, 8)

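Note what the tile docs say about prepending dimensions when the repeats tuple is longer than the array's shape: the array is promoted by adding new axes at the front, so A is treated as (1, 3, 4) here and the last axis that gets repeated is the existing one of size 4.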
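As a small sketch of that promotion rule (the variable names are only for illustration), compare what tile does when reps is shorter or longer than A.ndim, and how adding the trailing axis yourself gives the (m, n, k) result the question asks for:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

# reps shorter than A.ndim: reps is padded with 1s at the front,
# so 2 behaves like (1, 2) and only the last axis is repeated
short_reps = np.tile(A, 2)
print(short_reps.shape)                 # (3, 8)

# reps longer than A.ndim: A is promoted to (1, 3, 4) by prepending an axis
long_reps = np.tile(A, (2, 1, 1))
print(long_reps.shape)                  # (2, 3, 4)

# to repeat along a new trailing axis, add that axis explicitly first
B = np.tile(A[:, :, np.newaxis], (1, 1, 2))
print(B.shape)                          # (3, 4, 2)
print(np.array_equal(B[:, :, 0], A))    # True
print(np.array_equal(B[:, :, 1], A))    # True
```

So tile never guesses which axis you mean, even when m == n == k. New axes always go at the front unless you insert them yourself.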

But if you use the expanded array in a calculation, as described in the comments, you don't need to make a full repeated copy. Temporary views of the right shape can be used instead, taking advantage of broadcasting.

In [107]: A1=np.arange(12).reshape(3,4)
In [108]: A2=np.arange(8).reshape(4,2)
In [109]: A3=A1[:,:,None] + A2[None,:,:]
In [110]: A3.shape
Out[110]: (3, 4, 2)
In [111]: A3
Out[111]: 
array([[[ 0,  1],
        [ 3,  4],
        [ 6,  7],
        [ 9, 10]],

       [[ 4,  5],
        [ 7,  8],
        [10, 11],
        [13, 14]],

       [[ 8,  9],
        [11, 12],
        [14, 15],
        [17, 18]]])

With the None (np.newaxis), the array views are (3,4,1) and (1,4,2) shaped, which broadcast together as (3,4,2). I could leave off the None in the 2nd case, since broadcasting will add the leading dimension automatically. But the trailing None in the 1st case is required.

In [112]: (A1[:,:,None] + A2).shape
Out[112]: (3, 4, 2)
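To see why the trailing None cannot be dropped, here is a short sketch (the `failed` and `ok_shape` names are just for this example). Without it the shapes are (3,4) and (4,2), and broadcasting aligns them from the right, so 4 is compared with 2 and 3 with 4:

```python
import numpy as np

A1 = np.arange(12).reshape(3, 4)
A2 = np.arange(8).reshape(4, 2)

# (3, 4) + (4, 2): neither pair of aligned sizes matches or is 1
try:
    A1 + A2
    failed = False
except ValueError as err:
    failed = True
    print("broadcast error:", err)

# (3, 4, 1) + (4, 2): the leading axis of A2 is added automatically
ok_shape = (A1[:, :, None] + A2).shape
print(ok_shape)    # (3, 4, 2)
```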

To add a 1d array (last dimension):

In [113]: (A1[:,:,None] + np.array([1,2])[None,None,:]).shape
Out[113]: (3, 4, 2)
In [114]: (A1[:,:,None] + np.array([1,2])).shape
Out[114]: (3, 4, 2)

Two basic broadcasting steps:

add size 1 dimensions at the start as needed (automatic [None,....])
expand all size 1 dimensions to the shared size
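The two steps above can be written out as a small function. This is only an illustrative sketch, not how numpy implements it internally; `broadcast_shape` is a hypothetical helper, and the result is compared against what np.broadcast reports:

```python
import numpy as np

def broadcast_shape(*shapes):
    """Hypothetical helper: result shape of broadcasting the given shapes."""
    ndim = max(len(s) for s in shapes)
    # Step 1: pad each shape with size 1 dimensions at the start
    padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
    result = []
    for dims in zip(*padded):
        # Step 2: size 1 dimensions expand to the shared size;
        # all remaining sizes on this axis must agree
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} cannot be broadcast together")
        result.append(sizes.pop() if sizes else 1)
    return tuple(result)

for s1, s2 in [((3, 4, 1), (4, 2)), ((2,), (3, 1)), ((1, 2), (3, 1))]:
    numpy_shape = np.broadcast(np.empty(s1), np.empty(s2)).shape
    print(s1, s2, "->", broadcast_shape(s1, s2), "numpy:", numpy_shape)
```

Calling `broadcast_shape((2,), (3,))` raises ValueError, since 2 and 3 disagree and neither is 1.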

This set of calculations illustrates this:

In [117]: np.ones(2) + np.ones(3)
ValueError: operands could not be broadcast together with shapes (2,) (3,) 

In [118]: np.ones(2) + np.ones((1,3))
ValueError: operands could not be broadcast together with shapes (2,) (1,3) 

In [119]: np.ones(2) + np.ones((3,1))
Out[119]: 
array([[2., 2.],
       [2., 2.],
       [2., 2.]])
In [120]: np.ones((1,2)) + np.ones((3,1))
Out[120]: 
array([[2., 2.],
       [2., 2.],
       [2., 2.]])

With a missing middle dimension:

In [126]: np.repeat(A[:,None,:],2,axis=1)+np.ones(4)
Out[126]: 
array([[[ 1.,  2.,  3.,  4.],
        [ 1.,  2.,  3.,  4.]],

       [[ 5.,  6.,  7.,  8.],
        [ 5.,  6.,  7.,  8.]],

       [[ 9., 10., 11., 12.],
        [ 9., 10., 11., 12.]]])

There is a more 'advanced' alternative (but not necessarily faster):

In [127]: np.broadcast_to(A[:,None,:],(3,2,4))+np.ones(4)
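On the memory question, a quick sketch comparing the two (using the (m, n, k) layout from the question, with k = 2 chosen arbitrarily): repeat allocates a full copy, while broadcast_to returns a read-only view onto A's data with a zero stride on the expanded axis.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
k = 2

# full copy: allocates m*n*k elements
copy = np.repeat(A[:, :, np.newaxis], k, axis=-1)
# view: no new data buffer, the last axis just re-reads the same element
view = np.broadcast_to(A[:, :, np.newaxis], (3, 4, k))

print(np.array_equal(copy, view))    # True
print(np.shares_memory(copy, A))     # False
print(np.shares_memory(view, A))     # True
print(view.strides[-1])              # 0
print(view.flags.writeable)          # False
```

If you need to modify the expanded array, take the copy (or call `view.copy()`); for read-only use in arithmetic, the view or plain broadcasting avoids the extra allocation.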