python machine-learning logistic-regression scipy-optimize

Python - ValueError: operands could not be broadcast together with shapes (17,90) (17,)

I am trying to implement logistic regression with regularization in Python using optimize.minimize from the SciPy library. Here is my code:

import pandas as pd
import numpy as np
from scipy import optimize

l = 0.1 # lambda

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function_logit(theta, X, y, l):
    # m (the number of samples) is the module-level variable set below
    h = sigmoid(X @ theta)

    # cost (the intercept theta[0] is not regularized)
    J = -1 / m * (y.T @ np.log(h)
                 + (1 - y).T @ np.log(1 - h)) \
                 + l / (2 * m) * sum(theta[1:] ** 2)

    # gradient
    a = 1 / m * X.T @ (h - y)
    b = l / m * theta
    grad = a + b
    grad[0] = 1 / m * sum(h - y)  # overwrite so the intercept is not regularized

    return J, grad

data = pd.read_excel('Data.xlsx')

X = data.drop(columns = ['healthy'])
m, n = X.shape
X = X.to_numpy()
X = np.hstack([np.ones([m, 1]), X])

y = pd.DataFrame(data, columns = ['healthy'])
y = y.to_numpy()

initial_theta = np.zeros([n + 1, 1])

options = {'maxiter': 400}
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        (X, y, l),
                        jac = True,
                        method = 'TNC',
                        options = options)

An error occurs on the line where I use optimize.minimize. The last two lines of the error are as follows:

grad = a + b

ValueError: operands could not be broadcast together with shapes (17,90) (17,)

I have checked the type and dimensions of X, y and theta, and they seem correct to me.

>>> type(X)
<class 'numpy.ndarray'>
>>> type(y)
<class 'numpy.ndarray'>
>>> type(theta)
<class 'numpy.ndarray'>
>>> X.shape
(90, 17)
>>> y.shape
(90, 1)
>>> theta.shape
(17, 1)

The error says that a is a (17,90) matrix, but by my calculations it should be a (17,1) vector. Does anyone know where I went wrong?

Solution

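I found a solution. optimize.minimize works with a one-dimensional parameter vector, so theta reached cost_function_logit with shape (17,) rather than the (17,1) I had passed in; that is why b shows up as (17,) in the error. With theta flat, h has shape (90,), and subtracting the (90,1) array y from it broadcasts to (90,90), which makes a = 1 / m * X.T @ (h - y) a (17,90) matrix. Converting y and theta to shapes (90,) and (17,), respectively, made the error go away.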
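The shape mismatch can be reproduced without the optimizer. Below is a minimal sketch with random data (the array contents are made up; only the shapes match my problem), assuming theta arrives in the cost function as a 1-D array of shape (17,), as the (17,) operand in the error message suggests:

```python
import numpy as np

rng = np.random.default_rng(0)       # made-up data, only the shapes matter
m, n = 90, 16
X = rng.normal(size=(m, n + 1))      # (90, 17), same shape as X in the question
y_col = rng.integers(0, 2, size=(m, 1)).astype(float)  # (90, 1) column vector
theta = np.zeros(n + 1)              # (17,), assumed shape inside the cost function

h = 1 / (1 + np.exp(-(X @ theta)))   # (90, 17) @ (17,) gives (90,)
diff = h - y_col                     # (90,) minus (90, 1) broadcasts to (90, 90)
a = X.T @ diff                       # (17, 90) @ (90, 90) gives (17, 90)
print(h.shape, diff.shape, a.shape)

y_flat = y_col.ravel()               # (90,)
a_fixed = X.T @ (h - y_flat)         # (17, 90) @ (90,) gives (17,)
print(a_fixed.shape)
```

Once y is flat, h - y stays one-dimensional and a has the same shape as b, so the addition works.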

In terms of code, I changed

initial_theta = np.zeros([n + 1, 1])

to just this:

initial_theta = np.zeros([n + 1])

and I added the following line:

y = np.reshape(y, [m])
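For reference, here is a self-contained sketch of the whole thing with both changes applied. It is not my exact script: synthetic random data stands in for Data.xlsx, m is read from y inside the function instead of being a global, the log arguments are clipped to avoid log(0), and the solver options are left at their defaults. Those parts are assumptions for illustration only.

```python
import numpy as np
from scipy import optimize

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function_logit(theta, X, y, l):
    # theta: (n + 1,), X: (m, n + 1), y: (m,), all 1-D where possible
    m = y.size
    h = sigmoid(X @ theta)                    # (m,)
    h_safe = np.clip(h, 1e-12, 1 - 1e-12)     # addition: guard against log(0)
    J = -(y @ np.log(h_safe) + (1 - y) @ np.log(1 - h_safe)) / m \
        + l / (2 * m) * np.sum(theta[1:] ** 2)
    grad = X.T @ (h - y) / m                  # (n + 1,)
    grad[1:] += l / m * theta[1:]             # the intercept is not regularized
    return J, grad

# Synthetic stand-in for Data.xlsx (hypothetical data, same shapes as in the question)
rng = np.random.default_rng(0)
m, n = 90, 16
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n))])   # (90, 17)
true_theta = rng.normal(size=n + 1)
y = (rng.random(m) < sigmoid(X @ true_theta)).astype(float)  # (90,)

initial_theta = np.zeros(n + 1)               # (17,), not (17, 1)
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        args=(X, y, 0.1),
                        jac=True,
                        method='TNC')
print(res.x.shape, res.fun)
```

With every vector kept one-dimensional, the cost is a scalar and the gradient has the same shape as theta, which is what minimize expects when jac = True.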

Thanks to those who tried to help me.