python-3.x numpy tensorflow neural-network tflearn

tflearn DNN gives zero loss

I am using pandas to extract my data. To get an idea of my data I replicated an example dataset...

data = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))

which yields a dataset of shape=(100,4)...

    A   B   C   D
0  75  38  81  58
1  36  92  80  79
2  22  40  19   3
   ...    ...

I am using tflearn so I will need a target label as well. So I created a target label by extracting one of the columns from data and then dropped it out of the data variable (I also converted everything to numpy arrays)...

# Target label used for training
labels = np.array(data['A'].values, dtype=np.float32)

# Reshape target label from (100,) to (100, 1)
labels = np.reshape(labels, (-1, 1))

# Data for training minus the target label.
data = np.array(data.drop('A', axis=1).values, dtype=np.float32)

Then I take the data and the labels and feed them into the DNN...

# Deep Neural Network.
net = tflearn.input_data(shape=[None, 3])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 1, activation='softmax')
net = tflearn.regression(net)

# Define model.
model = tflearn.DNN(net)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

This seems like it should work, but the output I get is as follows...

Notice that the loss remains at 0, so I am definitely doing something wrong. I don't really know what form my data should be in. How can I get my training to work?

Solution

Your targets lie in the range 0 to 99, while a softmax activation in the output layer can only produce values in [0, 1]. With a single output unit it is even more restrictive: softmax normalizes across the units of a layer, so a one-unit softmax outputs exactly 1 for every input. You need to fix that, for example by using activation='linear' in the last layer. Also, the default loss for tflearn.regression is categorical cross-entropy, which is meant for classification problems and makes no sense in your scenario. You should try an L2 loss instead, such as loss='mean_square'. The reason you are getting zero loss in this setting is that your network predicts 1 for all training examples, and if you put that value into the formula for categorical cross-entropy, the loss is indeed zero because log(1) = 0. Here is the formula, where t[i][j] denotes the actual probability of class j for example i (which doesn't make sense in your problem) and o[i][j] is the predicted probability:

$loss(o, t) = -\frac{1}{n} \sum_i \sum_j t[i][j] \log(o[i][j])$
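To see the zero loss numerically, here is a small pure-Python check. It assumes the loss is evaluated as categorical cross-entropy (the tflearn.regression default) on the output of a one-unit softmax layer. The raw layer outputs are made-up values, and the helper names are only for illustration; the targets are the column A values from the sample rows in the question.

```python
import math


def softmax(logits):
    """Numerically stable softmax over the units of one layer."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(outputs, targets):
    """Mean over examples of -sum_j t[i][j] * log(o[i][j])."""
    per_example = [
        -sum(t * math.log(o) for t, o in zip(t_row, o_row))
        for o_row, t_row in zip(outputs, targets)
    ]
    return sum(per_example) / len(per_example)


def mean_squared_error(outputs, targets):
    """Mean of (o - t)^2 over all examples and output units."""
    squared = [
        (o - t) ** 2
        for o_row, t_row in zip(outputs, targets)
        for o, t in zip(o_row, t_row)
    ]
    return sum(squared) / len(squared)


# Hypothetical raw outputs of a layer with a single unit, one row per example.
raw_outputs = [[-3.2], [0.0], [41.7]]
# Regression targets (column A of the three sample rows shown in the question).
targets = [[75.0], [36.0], [22.0]]

# With one unit, softmax divides exp(z) by itself, so every prediction is 1.0.
predictions = [softmax(row) for row in raw_outputs]

# log(1.0) == 0.0, so the cross-entropy vanishes whatever the targets are.
ce = categorical_cross_entropy(predictions, targets)

# A squared-error loss, by contrast, exposes how far off the predictions are.
mse = mean_squared_error(predictions, targets)

print(predictions)
print(ce)
print(mse)
```

The cross-entropy comes out as zero for any raw outputs and any targets, so it cannot guide training here, while the squared error stays large until the predictions actually approach the targets.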

Here is more reasoning about why the default choice of loss function is not suitable for your case
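A minimal sketch of the suggested fix, assuming tflearn on TensorFlow 1.x and that data and labels are the float32 arrays of shape (100, 3) and (100, 1) built in the question. Only the output activation and the loss differ from the original network. The function name train_regression_model and the choice of metric='R2' are my own additions for illustration, not part of the original answer.

```python
def train_regression_model(data, labels, n_epoch=10, batch_size=16):
    """Fit the question's network with a linear output and an L2 loss.

    data:   float32 array of shape (n_samples, 3)
    labels: float32 array of shape (n_samples, 1)
    """
    # Imported inside the function so that defining it does not
    # require TensorFlow and tflearn to be installed.
    import tflearn

    net = tflearn.input_data(shape=[None, 3])
    net = tflearn.fully_connected(net, 32)
    net = tflearn.fully_connected(net, 32)
    # Linear output: predictions are no longer confined to [0, 1].
    net = tflearn.fully_connected(net, 1, activation='linear')
    # Mean squared error (an L2 loss) replaces the default
    # categorical cross-entropy; R2 is reported as the metric.
    net = tflearn.regression(net, optimizer='adam',
                             loss='mean_square', metric='R2')

    model = tflearn.DNN(net)
    model.fit(data, labels, n_epoch=n_epoch, batch_size=batch_size,
              show_metric=True)
    return model
```

Calling train_regression_model(data, labels) should now report a non-zero loss. Because the example dataset is random, column A has no real relationship to the other columns, so do not expect the loss to become small; the point is only that it is no longer stuck at zero.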