Dot product and addition of NumPy arrays and vectors

Consider the following MWE:

import numpy as np

def create_weight_matrix(nrows, ncols):
    """Create a weight matrix with normally distributed random elements."""
    return np.random.default_rng().normal(loc=0, scale=1/(nrows*ncols), size=(nrows, ncols))

def create_bias_vector(length):
    """Create a bias vector with normally distributed random elements."""
    return create_weight_matrix(length,1)

if __name__ == "__main__":

    num_samples = 100
    num_features = 5

    W = create_weight_matrix(4, num_features)
    b = create_bias_vector(4)

    x = np.random.rand(num_samples, num_features)

    y = np.dot(W, x[1])

    t = y + b

The intermediate variable y is computed correctly as the dot product of W and x[1], and the output is a vector of the form [ 0.06678158 0.02322523 0.09542323 -0.05746891]. I then try to add the vector b and would expect the addition to be performed element-wise, but for some reason the code adds every element of b to every element of y, producing a 4x4 matrix instead of a 4x1 vector. What am I missing here?


  • b has shape (4, 1) (a column vector), while y has shape (4,) (a 1-D array, which broadcasting treats like a row of shape (1, 4)). When you add them, both are broadcast to shape (4, 4). You can avoid this by explicitly reshaping one of the arrays to the shape of the other.

    To get a column vector of shape (4, 1), use either of the following lines:

    t = b + y.reshape(b.shape)
    t = b + y[:, None]

    y[:, None] adds a new trailing axis to the 1-D array, giving it shape (4, 1).

    Or, to get a 1-D array of shape (4,), use either of these:

    t = b.reshape(y.shape) + y
    t = b.squeeze() + y
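
As a quick check of the shapes involved, here is a minimal sketch that reproduces the (4, 4) result and both fixes. The seed and the use of `@` in place of `np.dot` are assumptions made only for this illustration; they are not part of the original MWE.

```python
import numpy as np

# Seeded generator used only so the sketch is reproducible (an assumption).
rng = np.random.default_rng(0)

W = rng.normal(size=(4, 5))   # weight matrix, shape (4, 5)
b = rng.normal(size=(4, 1))   # bias as a column vector, shape (4, 1)
x = rng.random((100, 5))      # samples, shape (100, 5)

y = W @ x[1]                  # 1-D result, shape (4,)

t_bad = y + b                 # (4,) + (4, 1) broadcasts to (4, 4)
t_col = b + y[:, None]        # both operands are (4, 1)
t_flat = b.squeeze() + y      # both operands are (4,)

print(t_bad.shape, t_col.shape, t_flat.shape)
```

In the broadcast result, entry [i, j] equals b[i] + y[j], so only its diagonal holds the intended element-wise sums.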