I'm implementing linear regression with gradient descent, using plain Python for loops rather than vectorized tensor operations.
I think my code is logically correct, and when I plot the results the theta values and the fitted line look good. But the value of the cost function is high. Can you help me?
The cost function value is 1,160,934, which seems abnormal.
def gradient_descent(alpha, x, y, ep=0.0001, max_repeat=10000000):
    m = x.shape[0]
    converged = False
    repeat = 0
    theta0 = 1.0
    theta3 = -1.0
    # J = sum([(theta0 + theta3*x[i] - y[i])**2 for i in range(m)]) / (2*m)
    J = 1
    while not converged:
        grad0 = sum([(theta0 + theta3*x[i] - y[i]) for i in range(m)]) / m
        grad1 = sum([(theta0 + theta3*x[i] - y[i]) * x[i] for i in range(m)]) / m
        # simultaneous update of both parameters
        temp0 = theta0 - alpha*grad0
        temp1 = theta3 - alpha*grad1
        theta0 = temp0
        theta3 = temp1
        # divide by (2*m); writing "* (1 / 2*m)" would multiply by m/2 instead
        msqe = sum([(theta0 + theta3*x[i] - y[i])**2 for i in range(m)]) / (2*m)
        print(theta0, theta3, msqe)
        if abs(J - msqe) <= ep:
            print('Converged, iterations: {0} !!!'.format(repeat))
            converged = True
        J = msqe
        repeat += 1
        if repeat == max_repeat:
            converged = True
            print("reached max_repeat")
    return theta0, theta3, J
[theta0, theta3, J] = gradient_descent(0.001, X3, Y, ep=0.0000001, max_repeat=1000000)
print("************\n theta0 : {0}\ntheta3 : {1}\nJ : {2}\n".format(theta0, theta3, J))
I think the dataset itself is quite spread out, and that's why even the best-fit line yields a large cost value. If you scale your data, you will see it drop significantly.
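To illustrate that point with synthetic data (the original `X3`/`Y` aren't shown, so the numbers here are assumptions): when the targets are spread out, even the best-fit line leaves large residuals and therefore a large half-MSE, while standardizing the data makes the same fit report a small cost.

```python
import numpy as np

# hypothetical spread-out data: noisy line with residual std around 200
rng = np.random.default_rng(0)
x = rng.uniform(0, 1000, size=200)
y = 3.0 * x + 50.0 + rng.normal(0, 200, size=200)

def half_mse(theta0, theta1, x, y):
    """Half mean squared error, the cost used in the question."""
    return np.sum((theta0 + theta1 * x - y) ** 2) / (2 * len(x))

# best-fit line via np.polyfit, just to evaluate the minimum achievable cost
slope, intercept = np.polyfit(x, y, 1)
cost_raw = half_mse(intercept, slope, x, y)   # large: roughly 200**2 / 2

# standardize both x and y, then refit: same quality of fit, much smaller cost
x_s = (x - x.mean()) / x.std()
y_s = (y - y.mean()) / y.std()
slope_s, intercept_s = np.polyfit(x_s, y_s, 1)
cost_scaled = half_mse(intercept_s, slope_s, x_s, y_s)

print(cost_raw, cost_scaled)
```

The absolute value of the cost therefore says little on its own; it only means something relative to the scale (variance) of `y`.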