Tags: python, matlab, tensorflow, logistic-regression, sigmoid

tf.sigmoid() overflow when used with larger values


I've been trying to build a univariate logistic regression model in Python using TensorFlow, porting what I learned in MATLAB (the ML course on Coursera by Andrew Ng). The model converges, but only when the initial theta0 and theta1 variables are small (about 1.00); if the initial value is set to 100.00, it returns nan instead of the converged values. The same thing happens when the learning rate is increased. The Python code is:

import tensorflow as tf
import numpy as np
import os
import matplotlib.pyplot as plt


os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
epoch = 100000


# 100 random samples in [0, 100); the label is 1.0 exactly when x >= 50
x_data = (np.random.rand(100)*100).astype(np.float64)
y_data = np.vectorize(lambda y: 0.00 if y < 50.00 else 1.00)(x_data)

theta0 = tf.Variable(1, dtype=tf.float64)
theta1 = tf.Variable(-1, dtype=tf.float64)

# hand-written logistic hypothesis and cross-entropy loss
hypothesis = theta0 + (theta1 * x_data)
hypothesis = tf.sigmoid(hypothesis)

term1 = -(y_data * tf.log(hypothesis))
term2 = -((1-y_data) * tf.log(1-hypothesis))

loss = tf.reduce_mean(term1 + term2)

optimizer = tf.train.GradientDescentOptimizer(0.006).minimize(loss)
init_var = tf.global_variables_initializer()

train_data = []
with tf.Session() as sess:
    sess.run(init_var)
    for i in range(epoch):
        train_data.append(sess.run([optimizer, theta0, theta1, loss])[1:])
        if i%100==0:
            print("Epoch ", i, ":", sess.run([theta0, theta1, loss]))

Explanations for the described behavior of the code and corrections, or even better code for the above purpose, would be deeply appreciated.


Solution

  • You should use tf.nn.sigmoid_cross_entropy_with_logits instead of taking the sigmoid yourself and then applying a log to compute the loss. sigmoid_cross_entropy_with_logits contains internal logic that prevents exactly this kind of numerical underflow/overflow. What you are seeing: with theta0 = 100.0 and theta1 = -1.0, the logit theta0 + theta1 * x_data is around 100 for small x, and tf.sigmoid(100.0) rounds to exactly 1.0 in float64 (the true value differs from 1 by about e^-100, far below machine precision). tf.log(1 - hypothesis) then evaluates log(0) = -inf, the loss becomes inf, and the gradients come back as nan. Raising the learning rate drives the parameters into the same saturated region within a few steps, which is why it fails the same way.
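
A minimal sketch of that change, written against the same TF 1.x graph API the question uses (same data, learning rate, and training loop; theta0 is deliberately initialized to the 100.0 that previously produced nan):

import os

import numpy as np
import tensorflow as tf

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

x_data = (np.random.rand(100) * 100).astype(np.float64)
y_data = np.vectorize(lambda y: 0.0 if y < 50.0 else 1.0)(x_data)

theta0 = tf.Variable(100.0, dtype=tf.float64)  # starting point that broke before
theta1 = tf.Variable(-1.0, dtype=tf.float64)

# Keep the raw logits; do NOT apply tf.sigmoid before computing the loss.
logits = theta0 + (theta1 * x_data)

# Same cross-entropy as the manual term1 + term2, but computed in the
# stable form max(x, 0) - x*y + log(1 + exp(-|x|)), which never takes
# the log of a quantity that has rounded to 0.
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_data, logits=logits))

optimizer = tf.train.GradientDescentOptimizer(0.006).minimize(loss)
init_var = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init_var)
    for i in range(100000):
        sess.run(optimizer)
        if i % 100 == 0:
            print("Epoch", i, ":", sess.run([theta0, theta1, loss]))

tf.sigmoid(logits) remains safe to evaluate afterwards for predictions; the instability comes only from taking the log of a sigmoid output that has already saturated to exactly 0.0 or 1.0.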