I am trying to implement logistic regression with regularization in Python using optimize.minimize
from the SciPy library. Here is my code:
import pandas as pd
import numpy as np
from scipy import optimize
l = 0.1 # lambda
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function_logit(theta, X, y, l):
    h = sigmoid(X @ theta)
    # cost
    J = -1 / m * (y.T @ np.log(h)
                  + (1 - y).T @ np.log(1 - h)) \
        + l / (2 * m) * sum(theta[1:] ** 2)
    # gradient
    a = 1 / m * X.T @ (h - y)
    b = l / m * theta
    grad = a + b
    grad[0] = 1 / m * sum(h - y)
    return J, grad

data = pd.read_excel('Data.xlsx')
X = data.drop(columns = ['healthy'])
m, n = X.shape
X = X.to_numpy()
X = np.hstack([np.ones([m, 1]), X])
y = pd.DataFrame(data, columns = ['healthy'])
y = y.to_numpy()
initial_theta = np.zeros([n + 1, 1])
options = {'maxiter': 400}
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        (X, y, l),
                        jac = True,
                        method = 'TNC',
                        options = options)
An error occurs on the line where I use optimize.minimize. The last two lines of the error are as follows:
grad = a + b
ValueError: operands could not be broadcast together with shapes (17,90) (17,)
I have checked the type and dimensions of X, y and theta, and they seem correct to me.
>>> type(X)
<class 'numpy.ndarray'>
>>> type(y)
<class 'numpy.ndarray'>
>>> type(theta)
<class 'numpy.ndarray'>
>>> X.shape
(90, 17)
>>> y.shape
(90, 1)
>>> theta.shape
(17, 1)
The error says a is a (17,90) matrix but based on my calculations it should be a (17,1) vector. Does anyone know where I went wrong?
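I found a solution. Apparently, optimize.minimize didn't like that y and theta had shapes (90,1) and (17,1), respectively. The (17,) in the error message shows that theta reaches the cost function as a 1-D array, so h has shape (90,), and h - y broadcasts against the (90,1) y into a (90,90) matrix, which is why a came out as (17,90). I converted y and theta to shapes (90,) and (17,), and the error message went away.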
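To see the shape problem in isolation, here is a minimal sketch using synthetic arrays with the shapes reported above. The random data and the names y_col, y_flat, a_bad and a_good are made up for illustration and are not from Data.xlsx. It assumes theta is 1-D inside the cost function, as the (17,) in the error message suggests.

```python
import numpy as np

m, n_params = 90, 17                  # sizes taken from the shapes in the question
rng = np.random.default_rng(0)        # synthetic stand-in data (assumption)
X = rng.normal(size=(m, n_params))
theta = np.zeros(n_params)            # 1-D parameter vector, shape (17,)

y_col = rng.integers(0, 2, size=(m, 1)).astype(float)  # column vector, shape (90, 1)
y_flat = y_col.ravel()                                  # flat vector, shape (90,)

h = 1 / (1 + np.exp(-(X @ theta)))    # shape (90,)

diff_col = h - y_col                  # (90,) minus (90, 1) broadcasts to (90, 90)
diff_flat = h - y_flat                # stays (90,)

a_bad = X.T @ diff_col / m            # shape (17, 90): the shape from the error
a_good = X.T @ diff_flat / m          # shape (17,): the intended gradient shape

print(diff_col.shape, a_bad.shape)
print(diff_flat.shape, a_good.shape)
```

With a_bad at (17, 90) and the regularization term at (17,), the sum a + b cannot be broadcast, which matches the ValueError above.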
In terms of code, I changed
initial_theta = np.zeros([n + 1, 1])
to just this:
initial_theta = np.zeros([n + 1])
and I added the following line:
y = np.reshape(y, [m])
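For reference, here is a sketch of the whole thing with both changes applied. It is not the exact script above: it runs on randomly generated data instead of Data.xlsx, computes m inside the cost function rather than reading a global, adds a small eps inside the logarithms as a guard against log(0), and leaves out the maxiter option for brevity. Those choices, along with the names features and true_theta, are assumptions made for illustration.

```python
import numpy as np
from scipy import optimize


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost_function_logit(theta, X, y, l):
    """Regularized logistic cost and gradient; theta and y are 1-D arrays."""
    m = y.size
    h = sigmoid(X @ theta)                       # shape (m,)
    eps = 1e-12                                  # assumption: avoids log(0)
    J = (-(y @ np.log(h + eps) + (1 - y) @ np.log(1 - h + eps)) / m
         + l / (2 * m) * np.sum(theta[1:] ** 2))
    grad = X.T @ (h - y) / m                     # shape (n + 1,)
    grad[1:] += l / m * theta[1:]                # intercept is not regularized
    return J, grad


# Synthetic stand-in for the spreadsheet: 90 rows, 16 features plus intercept.
rng = np.random.default_rng(0)
m, n = 90, 16
features = rng.normal(size=(m, n))
X = np.hstack([np.ones((m, 1)), features])
true_theta = rng.normal(size=n + 1)              # hypothetical generating weights
y = (rng.random(m) < sigmoid(X @ true_theta)).astype(float)   # shape (m,)

initial_theta = np.zeros(n + 1)                  # shape (n + 1,), not (n + 1, 1)
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        args=(X, y, 0.1),
                        jac=True,
                        method='TNC')
print(res.x.shape, res.fun)
```

Keeping y and theta 1-D throughout means every intermediate (h, h - y, grad) is 1-D as well, so no transposes are needed and the gradient comes back in the same shape as the parameter vector.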
Thanks to those who tried to help me.