
logistic regression and numpy: ValueError: operands could not be broadcast together with shapes


Machine learning beginner here.
In Python 3.7, I keep getting this error when trying to run scipy.optimize's fmin_tnc.
I know this type of question has been asked several times, but despite having checked my matrix dimensions and the code several times, I can't find my mistake.

Here is the function:

def compute_cost(theta, X, y, lambda_):
    m = len(y)
    mask = np.eye(len(theta))
    mask[0,0] = 0

    hypo = sigmoid(X @ theta)
    func = y.T @ np.log(hypo) + (1-y.T) @ np.log(1-hypo)
    cost = -1/m * func
    reg_cost = cost + lambda_/(2*m) * (mask@theta).T @ (mask@theta)

    grad = 1/m * X.T@(hypo-y) + lambda_/m * (mask@theta)

    return reg_cost.item(), grad

Here are my dimensions:

X: (118, 3)
y: (118, 1)
theta: (3, 1)

The function call:

initial_theta = np.zeros((3,1))
lambda_ = 1

thetopt, nfeval, rc = opt.fmin_tnc(
    func=compute_cost, 
    x0=initial_theta, 
    args=(X, y, lambda_)
)

And the error:

File "<ipython-input-21-f422f885412a>", line 16, in compute_cost
    grad = 1/m * X.T@(hypo-y) + lambda_/m * (mask@theta)

ValueError: operands could not be broadcast together with shapes (3,118) (3,)
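
The two shapes in the error can be reproduced in isolation: if theta arrives as a flat (3,) array instead of (3, 1), then X @ theta is (118,), hypo - y broadcasts against the (118, 1) y into (118, 118), and the two gradient terms come out as (3, 118) and (3,). A minimal sketch with placeholder data (the array contents are dummies; only the shapes matter):

```python
import numpy as np

X = np.ones((118, 3))     # placeholder data with the question's shapes
y = np.ones((118, 1))
theta = np.zeros(3)       # flat (3,), not (3, 1)

hypo = X @ theta          # (118,)
diff = hypo - y           # (118,) vs (118, 1) broadcasts to (118, 118)
left = X.T @ diff         # (3, 118)
right = np.eye(3) @ theta # (3,)

print(left.shape, right.shape)  # (3, 118) (3,)
# left + right raises:
# ValueError: operands could not be broadcast together with shapes (3,118) (3,)
```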

Thanks for your help!


Solution

  • In scipy.optimize.tnc, fmin_tnc calls _minimize_tnc, which does the heavy lifting. Almost the first thing that function does (line 348) is flatten x0:

    x0 = asfarray(x0).flatten()
    

    So what you need to do is reshape it inside your function. Just add this line at the beginning of your compute_cost function:

    theta = theta.reshape((3, 1))
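
A runnable sketch of the fixed function follows. The sigmoid and the data are stand-ins, since neither appears in the question; the gradient is also returned flat, to be safe, since the optimizer works with flat arrays throughout:

```python
import numpy as np
import scipy.optimize as opt

def sigmoid(z):
    # stand-in: the question's sigmoid isn't shown
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost(theta, X, y, lambda_):
    theta = theta.reshape((3, 1))  # undo the optimizer's flatten
    m = len(y)
    mask = np.eye(len(theta))
    mask[0, 0] = 0                 # leave the intercept unregularized

    hypo = sigmoid(X @ theta)
    func = y.T @ np.log(hypo) + (1 - y.T) @ np.log(1 - hypo)
    cost = -1 / m * func
    reg_cost = cost + lambda_ / (2 * m) * (mask @ theta).T @ (mask @ theta)

    grad = 1 / m * X.T @ (hypo - y) + lambda_ / m * (mask @ theta)
    return reg_cost.item(), grad.flatten()  # flat gradient, to be safe

# stand-in data with the question's shapes
rng = np.random.default_rng(0)
X = np.hstack([np.ones((118, 1)), rng.normal(size=(118, 2))])
y = (rng.random((118, 1)) > 0.5).astype(float)

theta_opt, nfeval, rc = opt.fmin_tnc(
    func=compute_cost,
    x0=np.zeros((3, 1)),
    args=(X, y, 1),
)
print(theta_opt.shape)  # (3,)
```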