I'm trying binary classification with simple logistic regression. My output labels are {1, 0} (whether the student passed the exam or not), but the cost function returns NaN. What is wrong?
import numpy
import tensorflow as tf

learning_rate = 0.05
total_iterations = 1500
display_per = 100

data = numpy.loadtxt("ex2data1.txt", dtype=numpy.float32, delimiter=",")
training_X = numpy.asarray(data[:, [0, 1]])  # 100 x 2 matrix of exam scores, e.g. [98.771, 4.817]
training_Y = numpy.asarray(data[:, [2]], dtype=numpy.int)  # 100 x 1 labels, e.g. [1], [0], [0], [1]
m = data.shape[0]

x_i = tf.placeholder(tf.float32, [None, 2])  # None x 2
y_i = tf.placeholder(tf.float32, [None, 1])  # None x 1
W = tf.Variable(tf.zeros([2, 1]))  # 2 x 1
b = tf.Variable(tf.zeros([1]))  # 1 x 1
h = tf.nn.softmax(tf.matmul(x_i, W) + b)
cost = tf.reduce_sum(tf.add(tf.multiply(y_i, tf.log(h)), tf.multiply(1 - y_i, tf.log(1 - h)))) / -m
I tried this simple logistic cost function and it returned NaN. I thought my cost function was garbage, so I switched to the cost function from TensorFlow's examples:
cost = tf.reduce_mean(-tf.reduce_sum(y_i*tf.log(h), reduction_indices=1))
but that didn't work either.
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print("cost: ", sess.run(cost, feed_dict={x_i: training_X, y_i: training_Y}),
          "w: ", sess.run(W), "b: ", sess.run(b))
The function tf.nn.softmax expects the number of logits (the last dimension) to equal the number of classes (2 in your case, {1, 0}). Since the last dimension in your case is 1, softmax will always return 1: the probability of being in the only available class is always 1, since no other class exists. Therefore h is a tensor filled with 1's, and tf.log(1-h) returns negative infinity. Infinity multiplied by zero (1-y_i in the rows where y_i is 1) returns NaN.
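You can see this concretely with a small repro sketch (assuming TF 1.x, to match the tf.placeholder / tf.Session style in your code; the constants are made up for illustration):

import tensorflow as tf

logits = tf.constant([[0.3], [-2.0], [5.0]])  # shape (3, 1): only one "class"
h = tf.nn.softmax(logits)                     # normalizes over the last axis of size 1
with tf.Session() as sess:
    print(sess.run(h))                        # [[1.], [1.], [1.]]
    print(sess.run(tf.log(1 - h)))            # [[-inf], [-inf], [-inf]]
    # 0 * -inf = nan under IEEE float rules, and it propagates through the sum:
    print(sess.run(tf.reduce_sum(0.0 * tf.log(1 - h))))  # nan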
You should replace tf.nn.softmax with tf.nn.sigmoid.
A possible fix is:

h = tf.nn.sigmoid(tf.matmul(x_i, W) + b)
cost = tf.reduce_sum(tf.add(tf.multiply(y_i, tf.log(h)), tf.multiply(1 - y_i, tf.log(1 - h)))) / -m
Or better, you can use tf.nn.sigmoid_cross_entropy_with_logits. In that case, it should be done as follows:

h = tf.matmul(x_i, W) + b
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_i, logits=h))
This function is more numerically stable than applying tf.nn.sigmoid followed by the cross-entropy function, which can return NaN when tf.nn.sigmoid gets near 0 or 1 due to the imprecision of float32.
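For completeness, here is a minimal end-to-end sketch of the corrected version (hypothetical, assuming TF 1.x and the same ex2data1.txt layout as in the question; the feature-scaling line is my addition, not part of your original code):

import numpy
import tensorflow as tf

learning_rate = 0.05
total_iterations = 1500
display_per = 100

data = numpy.loadtxt("ex2data1.txt", dtype=numpy.float32, delimiter=",")
training_X = data[:, [0, 1]]  # 100 x 2 exam scores
training_Y = data[:, [2]]     # 100 x 1 labels in {0, 1}

# Not in the original question: scaling the ~0-100 exam scores keeps the
# gradients well-conditioned for plain gradient descent at this learning rate.
training_X = (training_X - training_X.mean(axis=0)) / training_X.std(axis=0)

x_i = tf.placeholder(tf.float32, [None, 2])
y_i = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1]))

logits = tf.matmul(x_i, W) + b  # raw scores; no sigmoid here, the loss applies it internally
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_i, logits=logits))
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(total_iterations):
        _, c = sess.run([train_op, cost], feed_dict={x_i: training_X, y_i: training_Y})
        if (step + 1) % display_per == 0:
            print("step:", step + 1, "cost:", c)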