I've been trying to build a univariate logistic regression model in Python using TensorFlow, based on what I learnt in Matlab (the ML course on Coursera by Andrew Ng). The model converges, but only when the initial theta0 and theta1 variables are set small (about 1.00); the parameters and loss come back as NaN if the initial value is set to 100.00. The same thing happens when the learning rate is increased. The Python code is:
import tensorflow as tf
import numpy as np
import os
import matplotlib.pyplot as plt
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
epoch = 100000
x_data = (np.random.rand(100)*100).astype(np.float64)
y_data = np.vectorize(lambda y: 0.00 if y < 50.00 else 1.00)(x_data)
theta0 = tf.Variable(1, dtype=tf.float64)
theta1 = tf.Variable(-1, dtype=tf.float64)
hypothesis = theta0 + (theta1 * x_data)
hypothesis = tf.sigmoid(hypothesis)
term1 = -(y_data * tf.log(hypothesis))
term2 = -((1-y_data) * tf.log(1-hypothesis))
loss = tf.reduce_mean(term1 + term2)
optimizer = tf.train.GradientDescentOptimizer(0.006).minimize(loss)
init_var = tf.global_variables_initializer()
train_data = []
with tf.Session() as sess:
    sess.run(init_var)
    for i in range(epoch):
        train_data.append(sess.run([optimizer, theta0, theta1, loss])[1:])
        if i % 100 == 0:
            print("Epoch ", i, ":", sess.run([theta0, theta1, loss]))
An explanation of the described behavior and corrections, or even better code for the above purpose, would be deeply appreciated.
You should be using tf.nn.sigmoid_cross_entropy_with_logits instead of applying the sigmoid yourself and then taking the log to compute the loss. sigmoid_cross_entropy_with_logits has internal logic to help prevent numerical underflow/overflow. With a large initial theta1 (or a large learning rate), theta1 * x_data drives the sigmoid to saturate at exactly 0.0 or 1.0 in floating point, so tf.log(hypothesis) or tf.log(1 - hypothesis) evaluates to -inf and the loss and gradients become NaN.
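As a minimal sketch of that change, keeping the question's TF 1.x graph-mode setup and data (the stable formula mentioned in the comment is the one given in the TensorFlow documentation for this op):

import tensorflow as tf
import numpy as np

# Same synthetic data as in the question: label is 1.0 when x >= 50.
x_data = (np.random.rand(100) * 100).astype(np.float64)
y_data = np.vectorize(lambda y: 0.0 if y < 50.0 else 1.0)(x_data)

theta0 = tf.Variable(100.0, dtype=tf.float64)  # a large initial value is now tolerable
theta1 = tf.Variable(-1.0, dtype=tf.float64)

# Keep the raw logits; do NOT apply tf.sigmoid before the loss.
logits = theta0 + theta1 * x_data

# Numerically stable cross-entropy, computed internally as
# max(z, 0) - z * y + log(1 + exp(-|z|)), so log(0) never occurs.
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_data, logits=logits))

optimizer = tf.train.GradientDescentOptimizer(0.006).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100000):
        sess.run(optimizer)
        if i % 100 == 0:
            print("Epoch", i, ":", sess.run([theta0, theta1, loss]))

The rest of the training loop stays the same as in your code; only the loss construction changes, and the sigmoid is applied only if you later want the predicted probabilities (e.g. tf.sigmoid(logits) at evaluation time).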