Machine learning beginner here.
In Python 3.7, I keep getting this error when trying to run scipy.optimize's fmin_tnc.
I know this type of question has been asked several times, but despite having checked my matrix dimensions and the code several times, I can't find my mistake.
Here is the function:
def compute_cost(theta, X, y, lambda_):
    m = len(y)
    mask = np.eye(len(theta))
    mask[0,0] = 0
    hypo = sigmoid(X @ theta)
    func = y.T @ np.log(hypo) + (1-y.T) @ np.log(1-hypo)
    cost = -1/m * func
    reg_cost = cost + lambda_/(2*m) * (mask@theta).T @ (mask@theta)
    grad = 1/m * X.T@(hypo-y) + lambda_/m * (mask@theta)
    return reg_cost.item(), grad
Here are my dimensions:
X: (118, 3)
y: (118, 1)
theta: (3, 1)
The function call:
initial_theta = np.zeros((3,1))
lambda_ = 1
thetopt, nfeval, rc = opt.fmin_tnc(
    func=compute_cost,
    x0=initial_theta,
    args=(X, y, 1)
)
And the error:
  File "<ipython-input-21-f422f885412a>", line 16, in compute_cost
    grad = 1/m * X.T@(hypo-y) + lambda_/m * (mask@theta)
ValueError: operands could not be broadcast together with shapes (3,118) (3,)
Thanks for your help!
In scipy.optimize.tnc, fmin_tnc calls _minimize_tnc, which seems to do the heavy lifting. Almost the first thing that function does (line 348) is flatten x0:
x0 = asfarray(x0).flatten()
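So compute_cost receives theta with shape (3,) instead of (3, 1). Then X @ theta has shape (118,), subtracting y of shape (118, 1) broadcasts hypo - y to (118, 118), and X.T @ (hypo - y) comes out as (3, 118), which can't be added to the (3,) regularization term. What you need to do is reshape theta inside your function. Just add this line at the beginning of your compute_cost function: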
theta = theta.reshape((3, 1))
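For reference, here is a minimal sketch of the fix that can be run without the optimizer. It makes a few assumptions that are not from your post: the sigmoid helper is the standard logistic function, X and y are random stand-in data with the shapes you listed, and the flattening is mimicked with np.asarray(...).flatten() rather than by calling SciPy. I also used reshape((-1, 1)) instead of a hard-coded 3, and returned a flat gradient so it matches the flat theta coming in. Both are my own choices, not something fmin_tnc is confirmed to require.

```python
import numpy as np

def sigmoid(z):
    # Logistic function; assumed to match the question's sigmoid helper.
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost(theta, X, y, lambda_):
    # The optimizer passes theta as a flat (n,) array, so restore the column shape.
    theta = theta.reshape((-1, 1))
    m = len(y)
    mask = np.eye(len(theta))
    mask[0, 0] = 0  # do not regularize the intercept term
    hypo = sigmoid(X @ theta)
    func = y.T @ np.log(hypo) + (1 - y.T) @ np.log(1 - hypo)
    cost = -1 / m * func
    reg_cost = cost + lambda_ / (2 * m) * (mask @ theta).T @ (mask @ theta)
    grad = 1 / m * X.T @ (hypo - y) + lambda_ / m * (mask @ theta)
    # Return a flat gradient so it has the same shape as the theta received.
    return reg_cost.item(), grad.ravel()

# Synthetic stand-in data with the shapes from the question (not the real dataset).
rng = np.random.default_rng(0)
X = np.hstack([np.ones((118, 1)), rng.normal(size=(118, 2))])
y = rng.integers(0, 2, size=(118, 1)).astype(float)

# Mimic the flattening applied to x0 before the cost function is called.
flat_theta = np.asarray(np.zeros((3, 1)), dtype=float).flatten()

# Without the reshape, (118,) - (118, 1) broadcasts to (118, 118),
# so X.T @ residual has the (3, 118) shape seen in the error message.
bad_residual = sigmoid(X @ flat_theta) - y
bad_shape = (X.T @ bad_residual).shape

# With the reshape inside compute_cost, a flat theta works.
cost, grad = compute_cost(flat_theta, X, y, 1)
print(flat_theta.shape, bad_shape, grad.shape)
```

The same compute_cost should then be usable as func in your opt.fmin_tnc call, with x0 and args unchanged.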