I wanted to try implementing gradient descent by myself, and I wrote this:
```python
# Creating random sample dataset
import random as rnd

dataset = []
for i in range(0, 500):
    d_dataset = [i, rnd.randint((i-4), (i+4))]
    dataset.append(d_dataset)

def gradient_descent(t0, t1, lrate, ds):
    length = len(ds)
    c0, c1 = 0, 0
    for element in ds:
        elx = element[0]
        ely = element[1]
        c0 += ((t0 + (t1*elx) - ely))
        c1 += ((t0 + (t1*elx) - ely)*elx)
    t0 -= (lrate * c0 / length)
    t1 -= (lrate * c1 / length)
    return t0, t1

def train(t0, t1, lrate, trainlimit, trainingset):
    k = 0
    while k < trainlimit:
        new_t0, new_t1 = gradient_descent(t0, t1, lrate, trainingset)
        t0, t1 = new_t0, new_t1
        k += 1
    return t0, t1

print(gradient_descent(20, 1, 1, dataset))
print(train(0, 0, 1, 10000, dataset))
```
Whenever I run this, I get a somewhat normal output from `gradient_descent()`, but I get `(nan, nan)` from the `train()` function. I tried running `train` with the input `(0, 0, 1, 10, dataset)` and I get the value `(-4.705770241957691e+46, -1.5670167612541557e+49)`, which seems very wrong. Please tell me what I'm doing wrong and how to fix this error. Sorry if this has been asked before, but I couldn't find any answers on how to fix the `nan` error.
When calling `print(train(0, 0, 1, 10000, dataset))`, the values returned by `gradient_descent(t0, t1, lrate, trainingset)` are increasing in every iteration of the `while` loop. When they become larger than the maximum value allowed for `float`, they are automatically converted to `float('inf')`, a `float` representing infinity. Check this maximum value on your system with `sys.float_info.max`:

```python
import sys
print(sys.float_info.max)
```
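As a small illustration of this conversion (my own sketch, not from the original post): ordinary float arithmetic that exceeds this maximum does not raise an error, it silently overflows to infinity, which is why the diverging parameters become `inf` without any warning:

```python
import sys

big = sys.float_info.max
# Multiplying past the representable range overflows to inf
print(big * 2)
print(big * 2 == float('inf'))
```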
However, your function `gradient_descent()` can't handle infinite values, which you can verify with the following call to your function:

```python
gradient_descent(float('inf'), float('inf'), 1, dataset)
```

The problem here is the following two lines in `gradient_descent()`, which are not well defined for infinite `t0` and `t1`:

```python
c0 += ((t0 + (t1*elx) - ely))
c1 += ((t0 + (t1*elx) - ely)*elx)
```
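To make the failure concrete, here is a small sketch (my own illustration; the variable values are hypothetical but match the dataset's range, whose first point has `elx == 0`) showing how infinite parameters turn into `nan` in exactly these expressions:

```python
import math

inf = float('inf')

# First data point: elx = 0. Then t1*elx is inf*0, which is nan,
# and the whole accumulated term becomes nan.
t0, t1, elx, ely = inf, inf, 0, 0
term = t0 + (t1 * elx) - ely
print(math.isnan(term))

# Even for elx > 0 the parameter update itself breaks:
# t0 -= lrate * c0 / length computes inf - inf, which is nan.
c0, length, lrate = inf, 500, 1
t0 = inf
t0 -= lrate * c0 / length
print(math.isnan(t0))
```

The underlying divergence comes from the step size: a learning rate of `1` is far too large for inputs up to 500. As a suggestion of my own (not part of the explanation above), a much smaller learning rate, for example `train(0, 0, 0.00001, 10000, dataset)`, should keep the updates from blowing up in the first place.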