Tags: python, numpy, gradient-descent

Gradient Descent returning nan in output


I have a dataset with 3 features and 1 target variable. I am trying to fit it with gradient descent and then minimize the RMSE.

While trying to run the code, I am getting nan as the cost/error term. I have tried a lot of methods but can't figure it out.

Can anyone please tell me where I am going wrong with the calculation? Here's the code:

import numpy as np

m = len(y)  # number of training samples

# gradient of the squared-error cost with respect to theta
def grad(theta):
    dJ = 1/m*np.sum((Xnorm.dot(theta)-ynorm.reshape(len(ynorm),1))*Xnorm,axis=0).reshape(-1,1)
    return dJ

# squared-error cost (note: unlike the gradient, not scaled by 1/m)
def cost(theta):
    J = np.sum((Xnorm.dot(theta)-ynorm.reshape(len(ynorm),1))**2,axis=0)
    return J

def GD(theta0, learning_rate=0.0005, epochs=500, TOL=1e-1):
    theta_history = [theta0]
    J_history = [cost(theta0)]
    print(J_history)

    # placeholder so thetanew exists even if the loop never runs
    thetanew = theta0*10000
    for epoch in range(epochs):
        if epoch % 100 == 0:
            print('epoch', epoch, 'cost', J_history[-1])
        dJ = grad(theta0)
        J = cost(theta0)

        # gradient-descent update
        thetanew = theta0 - learning_rate*dJ
        theta_history.append(thetanew)
        J_history.append(J)

        # stop once the parameter update becomes small
        if np.sum((thetanew - theta0)**2) < TOL:
            print('Convergence achieved.')
            break
        theta0 = thetanew

    return thetanew, theta_history, J_history
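
For reference, the grad function above is the gradient of a cost scaled by 1/(2m), not of the cost function shown. A cost consistent with that gradient would look like this (a minimal sketch, assuming the same Xnorm, ynorm, and m as in the question):

import numpy as np

# Cost whose gradient is exactly grad() above:
#   J(theta) = 1/(2m) * sum((X.theta - y)^2)
# Assumes the question's Xnorm, ynorm, and m are already defined.
def cost_scaled(theta):
    resid = Xnorm.dot(theta) - ynorm.reshape(-1, 1)
    return float(np.sum(resid**2) / (2*m))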

Even for the first theta value, it returns nan:

theta, theta_history, J_history = GD(theta0)

[screenshot of the output showing nan cost values]

Shape of my variables

[screenshot of the variable shapes]
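
For what it's worth, here is a self-contained version of the same loop on synthetic stand-in data (the real Xnorm/ynorm are not shown, so the arrays below are assumptions) that stops at the first non-finite cost; if the cost grows from step to step before turning nan, the learning rate is too large for the data:

import numpy as np

# Synthetic stand-in data: 100 samples, 3 features, no bias column.
rng = np.random.default_rng(0)
Xnorm = rng.normal(size=(100, 3))
ynorm = Xnorm @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
m = len(ynorm)
theta = np.zeros((3, 1))

learning_rate = 0.0005
for epoch in range(500):
    resid = Xnorm @ theta - ynorm.reshape(-1, 1)  # (m, 1) residuals
    dJ = Xnorm.T @ resid / m                      # (3, 1) gradient
    J = float(np.sum(resid**2) / (2*m))           # scaled cost
    if not np.isfinite(J):                        # stop at the first nan/inf
        print('cost became non-finite at epoch', epoch)
        break
    theta = theta - learning_rate * dJ
print('final cost:', J)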


Solution

  • The only reasonable conclusion we came to was that, since the cost was so high, it was not possible to use this approach for this problem. We tried a different approach, simple linear regression, and it worked (a minimal sketch follows below).
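
For illustration, a minimal sketch of that alternative as a closed-form least-squares fit, assuming the question's Xnorm and ynorm arrays (scikit-learn's LinearRegression would do the same job):

import numpy as np

# Closed-form linear regression in place of gradient descent.
# Assumes Xnorm with shape (m, 3) and ynorm with shape (m,) as in the question.
theta, residuals, rank, sv = np.linalg.lstsq(Xnorm, ynorm, rcond=None)
pred = Xnorm @ theta
rmse = np.sqrt(np.mean((pred - ynorm)**2))  # the RMSE the question targets
print('theta:', theta)
print('RMSE:', rmse)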