Right now I'm working through the exercises from Andrew Ng's ML course. I'm using Python for the logistic regression exercise, but the result is quite confusing to me, and something also goes wrong when I use the scipy.optimize function.
I'm asked to do classification using logistic regression. The cost and the gradient are computed by the calculate_cost and gradient functions in the code further down.
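In formula form, these are the standard (unregularized) logistic-regression cost and gradient, with h the sigmoid hypothesis:

J(\theta) = -\frac{1}{m}\left[ y^\top \log h(X\theta) + (1 - y)^\top \log\bigl(1 - h(X\theta)\bigr) \right]

\nabla_\theta J(\theta) = \frac{1}{m} X^\top \bigl( h(X\theta) - y \bigr)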
So instead of using the opt function provided in scipy, I want to stick with gradient descent to update the parameters, as taught earlier in the course.
But when I use different learning rates (alpha), the cost changes in very different ways.
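The update loop itself isn't included below; roughly, what I run is a plain gradient-descent loop over the calculate_cost and gradient functions defined further down (this is only a sketch, and the variable names here are illustrative):

# Sketch of the plain gradient-descent loop, using calculate_cost and gradient as defined below.
def gradient_descent(X, y, theta, alpha, iterations):
    costs = []
    for _ in range(iterations):
        # gradient() returns a (1, n) row vector, so transpose it back to match theta's shape
        theta = theta - alpha * gradient(X, y, theta).T
        costs.append(calculate_cost(X, y, theta))
    return theta, costs

theta_gd, costs = gradient_descent(X, y, theta, alpha=0.01, iterations=200000)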
Here's the result when alpha=0.0001, iterations=400000:
And here's the result when alpha=0.01, iterations=200000:
This chaotic-looking behaviour is hard for me to explain, even though the final result is acceptable. Is there an explanation for it?
Something also goes wrong when I use the scipy optimizer. Here is the code:
import numpy as np
from scipy import optimize as opt

# data is the exercise dataset (loading code omitted): two feature columns plus a label column
m, n = data.shape
X = data[:, 0:n-1]
y = data[:, -1:]
theta = np.array([[0] for i in range(n)], dtype=np.float64)
print(X.shape, y.shape, theta.shape)
# OUT: (100, 2) (100, 1) (3, 1)

# prepend a column of ones for the intercept term
X = np.concatenate((np.array([[1] for i in range(m)]), X), axis=1)

def sigmoid(z):
    return 1/(1+np.exp(-z))

def h(X, theta):
    return sigmoid(X@theta)

def calculate_cost(X, y, theta):
    first = -y.T @ np.log(h(X, theta)+1e-5)
    second = (1-y).T @ np.log(1-h(X, theta)+1e-5)
    return 1/m*(first-second)[0][0]

def gradient(X, y, theta):
    return 1/m*(X.T@(h(X, theta)-y)).T

result = opt.fmin_tnc(func=calculate_cost, x0=theta, fprime=gradient, args=(X, y))
A ValueError is thrown:
Cell In[5], line 4, in h(X, theta)
3 def h(X,theta):
----> 4 return sigmoid(X@theta)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 100 is different from 3)
How can I resolve this?
It seems that fmin_tnc passes the x0 parameter as the first argument to the function it tries to optimize. To solve this, just take theta as the first parameter in both calculate_cost and gradient.
I also fixed some mismatches that crashed the code for me (return 1/m*(first-second)[0][0] -> return 1/m*(first-second)[0], and 1/m*(X.T@(h(X,theta)-y)).T -> np.mean(1/m*(X.T@(h(X,theta)-y)).T, axis=0)), but you'll have to check whether it is actually appropriate to e.g. take the mean for the gradient.
import numpy as np
from scipy import optimize as opt

data = np.random.rand(100, 3)   # random stand-in for the exercise data

m, n = data.shape
X = data[:, 0:n-1]
y = data[:, -1:]
theta = np.array([[0] for i in range(n)], dtype=np.float64)
print(X.shape, y.shape, theta.shape)
# OUT: (100, 2) (100, 1) (3, 1)
X = np.concatenate((np.array([[1] for i in range(m)]), X), axis=1)

def sigmoid(z):
    return 1/(1+np.exp(-z))

def h(X, theta):
    return sigmoid(X@theta)

# theta is now the first parameter, because fmin_tnc passes x0 as the first argument
def calculate_cost(theta, X, y):
    first = -y.T @ np.log(h(X, theta)+1e-5)
    second = (1-y).T @ np.log(1-h(X, theta)+1e-5)
    return 1/m*(first-second)[0]

def gradient(theta, X, y):
    return np.mean(1/m*(X.T@(h(X, theta)-y)).T, axis=0)

result = opt.fmin_tnc(func=calculate_cost, x0=theta, fprime=gradient, args=(X, y))
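If it helps, here is one way to use what fmin_tnc returns (names like theta_opt below are just illustrative, not from the original code):

theta_opt, n_evals, rc = result                       # fmin_tnc returns (solution, number of function evaluations, return code)
predictions = (h(X, theta_opt) >= 0.5).astype(int)    # threshold the sigmoid output at 0.5
accuracy = np.mean(predictions == y.ravel())          # with the random stand-in data this number is meaningless
print(theta_opt, accuracy)

Also note that once theta arrives as a flat (n,) vector, h(X, theta) has shape (100,) while y is (100, 1), so h(X, theta) - y broadcasts to (100, 100); flattening y (e.g. with y.ravel()) inside gradient would give the usual 1/m * X.T @ (h - y) gradient directly, without needing the mean.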