I'm following this tutorial (section 6: Tying it All Together), but with my own dataset. I can get the example in the tutorial working with the provided sample dataset, no problem.
With my own data, though, I'm getting a negative binary cross-entropy loss and no improvement as the epochs progress. As far as I understand, binary cross-entropy should always be positive, and I'd expect the loss to improve at least a little. I've truncated the sample output (and the code call) below to 5 epochs. Others seem to run into similar problems when training CNNs, but I didn't find a clear solution for my case. Does anyone know why this is happening?
Sample output:
Creating TensorFlow device (/gpu:2) -> (device: 2, name: GeForce GTX TITAN Black, pci bus id: 0000:84:00.0)
10240/10240 [==============================] - 2s - loss: -5.5378 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 2/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 3/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 4/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 5/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
My code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# load training and test data (63 columns: 62 features + 1 label)
dataset = np.loadtxt('train_rows.csv', delimiter=",")
testset = np.loadtxt('test_rows.csv', delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, :62]
Y = dataset[:, 62]
X_test = testset[:, :62]
Y_test = testset[:, 62]
### create model
model = Sequential()
model.add(Dense(100, input_dim=62, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
## Fit the model
model.fit(X, Y, validation_data=(X_test, Y_test), epochs=5, batch_size=128)
I should have printed out my response variable first. The categories were labelled as 1 and 2 instead of 0 and 1, but binary_crossentropy with a sigmoid output expects labels in {0, 1}, so the stray label of 2 drove the loss negative and training stalled.
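To see why the loss goes negative, here's a quick standalone NumPy check (not part of my actual script; the bce helper and the 0.99 prediction value are just made up for illustration):

import numpy as np

# binary cross-entropy for a single example: -(y*log(p) + (1-y)*log(1-p))
def bce(y, p):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

print(bce(1, 0.99))  # ~0.01  -> label in {0, 1}, small positive loss
print(bce(2, 0.99))  # ~-4.59 -> label of 2 makes (1 - y) negative, loss drops below zero

With a label of 2, the (1 - y) factor flips sign and the +log(1 - p) term can pull the loss arbitrarily far below zero, which matches the stuck negative loss in my output. Remapping the labels to 0/1 before fitting (e.g. Y = Y - 1 and Y_test = Y_test - 1) was all that was needed.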