Right now I'm working through the exercises from Andrew Ng's ML course. I'm using Python for the logistic regression exercise, but the result is quite confusing to me, and something also goes wrong when I use the scipy.optimize function.
I'm asked to do classification using logistic regression. The cost and the gradient are computed by the calculate_cost and gradient functions in the code further down.
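In formula form, these are the standard (unregularized) logistic-regression cost and gradient, with h the sigmoid hypothesis:

J(\theta) = -\frac{1}{m}\left[ y^\top \log h(X\theta) + (1 - y)^\top \log\bigl(1 - h(X\theta)\bigr) \right]

\nabla_\theta J(\theta) = \frac{1}{m} X^\top \bigl( h(X\theta) - y \bigr)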
So instead of using the opt function provided in scipy, I want to stick with gradient descent to update the parameters, as taught earlier in the course.
But when I use different learning rates (alpha), the cost changes in very different ways.
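The update loop itself isn't included below; roughly, what I run is a plain gradient-descent loop over the calculate_cost and gradient functions defined further down (this is only a sketch, and the variable names here are illustrative):

# Sketch of the plain gradient-descent loop, using calculate_cost and gradient as defined below.
def gradient_descent(X, y, theta, alpha, iterations):
    costs = []
    for _ in range(iterations):
        # gradient() returns a (1, n) row vector, so transpose it back to match theta's shape
        theta = theta - alpha * gradient(X, y, theta).T
        costs.append(calculate_cost(X, y, theta))
    return theta, costs

theta_gd, costs = gradient_descent(X, y, theta, alpha=0.01, iterations=200000)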
Here's the result when alpha=0.0001, iterations=400000:
And here's the result when alpha=0.01, iterations=200000:
This chaotic-looking behaviour is hard for me to explain, even though the final result is acceptable. Is there an explanation for it?
Something also goes wrong when I use the scipy optimizer. Here is the code:
import numpy as np
from scipy import optimize as opt

# data is the exercise dataset (loading code omitted): two feature columns plus a label column
m, n = data.shape
X = data[:, 0:n-1]
y = data[:, -1:]
theta = np.array([[0] for i in range(n)], dtype=np.float64)
print(X.shape, y.shape, theta.shape)
# OUT: (100, 2) (100, 1) (3, 1)

# prepend a column of ones for the intercept term
X = np.concatenate((np.array([[1] for i in range(m)]), X), axis=1)

def sigmoid(z):
    return 1/(1+np.exp(-z))

def h(X, theta):
    return sigmoid(X@theta)

def calculate_cost(X, y, theta):
    first = -y.T @ np.log(h(X, theta)+1e-5)
    second = (1-y).T @ np.log(1-h(X, theta)+1e-5)
    return 1/m*(first-second)[0][0]

def gradient(X, y, theta):
    return 1/m*(X.T@(h(X, theta)-y)).T

result = opt.fmin_tnc(func=calculate_cost, x0=theta, fprime=gradient, args=(X, y))
A ValueError is thrown:
Cell In[5], line 4, in h(X, theta)
3 def h(X,theta):
----> 4 return sigmoid(X@theta)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 100 is different from 3)
How can I resolve this?
It seems that fmin_tnc passes the x0 parameter as the first argument to the function it tries to optimize. To solve this, just take theta as the first parameter in both calculate_cost and gradient.
I also fixed some mismatches that crashed the code for me (return 1/m*(first-second)[0][0] -> return 1/m*(first-second)[0], and 1/m*(X.T@(h(X,theta)-y)).T -> np.mean(1/m*(X.T@(h(X,theta)-y)).T, axis=0)), but you'll have to check whether it is actually appropriate to e.g. take the mean for the gradient.
import numpy as np
from scipy import optimize as opt

data = np.random.rand(100, 3)   # random stand-in for the exercise data

m, n = data.shape
X = data[:, 0:n-1]
y = data[:, -1:]
theta = np.array([[0] for i in range(n)], dtype=np.float64)
print(X.shape, y.shape, theta.shape)
# OUT: (100, 2) (100, 1) (3, 1)
X = np.concatenate((np.array([[1] for i in range(m)]), X), axis=1)

def sigmoid(z):
    return 1/(1+np.exp(-z))

def h(X, theta):
    return sigmoid(X@theta)

# theta is now the first parameter, because fmin_tnc passes x0 as the first argument
def calculate_cost(theta, X, y):
    first = -y.T @ np.log(h(X, theta)+1e-5)
    second = (1-y).T @ np.log(1-h(X, theta)+1e-5)
    return 1/m*(first-second)[0]

def gradient(theta, X, y):
    return np.mean(1/m*(X.T@(h(X, theta)-y)).T, axis=0)

result = opt.fmin_tnc(func=calculate_cost, x0=theta, fprime=gradient, args=(X, y))
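If it helps, here is one way to use what fmin_tnc returns (names like theta_opt below are just illustrative, not from the original code):

theta_opt, n_evals, rc = result                       # fmin_tnc returns (solution, number of function evaluations, return code)
predictions = (h(X, theta_opt) >= 0.5).astype(int)    # threshold the sigmoid output at 0.5
accuracy = np.mean(predictions == y.ravel())          # with the random stand-in data this number is meaningless
print(theta_opt, accuracy)

Also note that once theta arrives as a flat (n,) vector, h(X, theta) has shape (100,) while y is (100, 1), so h(X, theta) - y broadcasts to (100, 100); flattening y (e.g. with y.ravel()) inside gradient would give the usual 1/m * X.T @ (h - y) gradient directly, without needing the mean.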