I've been working on a neural network to classify two sets of astronomical data. I believe the network is struggling because the two sets are quite similar, but even when I make the data significantly different, the accuracy history still doesn't behave the way I'd expect.
These are example images from each dataset:
I'm currently using 10,000 images of each type (20,000 total), with 20% going to validation, so 16,000 training images and 4,000 validation images. Due to memory constraints, I can't increase the datasets much beyond this.
This is my current model:
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(64, (3, 3), padding="valid", activation='relu', input_shape=(192, 192, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (7, 7), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (9, 9), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (7, 7), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(2, activation="sigmoid"))
which I'm compiling with:
import tensorflow as tf
from tensorflow.keras.optimizers import SGD

opt = SGD(learning_rate=0.1)  # 'lr' is a deprecated alias for 'learning_rate'
model.compile(optimizer=opt,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
and fitting using:
history = model.fit(train, train_labels, batch_size=200, epochs=15, validation_data=(validation, validation_labels))
If I add something to the data to make the datasets unrealistically different (e.g. adding a random rectangle to the middle of the data or by adding a mask to one but not the other), I get an accuracy history that looks like this:
(Note that the accuracy history for the training data was shifted half an epoch to the left to account for the training accuracy being measured, on average, half an epoch before the validation accuracy.)
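(For reference, a minimal sketch of how such a half-epoch shift can be applied when plotting, assuming a standard Keras History object named history:)
import numpy as np
import matplotlib.pyplot as plt

epochs = np.arange(1, len(history.history['accuracy']) + 1)
# Training accuracy is averaged over the epoch, so plot it half an epoch earlier
plt.plot(epochs - 0.5, history.history['accuracy'], label='training accuracy')
plt.plot(epochs, history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()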
If I make the datasets very similar (e.g. adding nothing to the datasets or applying the same mask to both), I get an accuracy history that looks like this:
or occasionally with a big spike in validation accuracy for one epoch, like so:
Looking at different websites and other Stack Overflow posts, I've tried:
Am I missing something? Are these datasets just too similar for a binary classification network to tell apart?
If this is a binary classification problem, then you need to change:
model.add(layers.Dense(2, activation="sigmoid"))
into:
model.add(layers.Dense(1, activation="sigmoid"))
A sigmoid output indicates that if the value is greater than some threshold (usually 0.5), the sample belongs to the second class, and to the first class otherwise. Also, you really don't need to use from_logits=True, since you specified an activation in the last dense layer.
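For example, converting the sigmoid outputs into class predictions could look like this (a sketch, assuming validation is the array from the question):
probs = model.predict(validation)        # shape (n_samples, 1), values in [0, 1]
pred_labels = (probs > 0.5).astype(int)  # 1 = second class, 0 = first class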
Recall that your loss should then be:
tf.keras.losses.BinaryCrossentropy(from_logits=False)
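Putting both changes together, a minimal sketch of the corrected final layer and compile step (keeping the SGD optimizer from the question) would be:
model.add(layers.Dense(1, activation='sigmoid'))  # single unit for binary output

model.compile(optimizer=opt,
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
              metrics=['accuracy'])
Your existing integer 0/1 labels work unchanged with BinaryCrossentropy.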
If you want to set from_logits=True, then your last dense layer should look like this:
model.add(layers.Dense(1))  # no activation, i.e. a linear output (raw logits)
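The loss then has to match and expect raw logits:
model.compile(optimizer=opt,
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])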
You can also use 2 neurons in the last dense layer, but then you need to use a softmax activation with a categorical loss (e.g. SparseCategoricalCrossentropy for integer labels).
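For completeness, a sketch of that two-neuron alternative:
model.add(layers.Dense(2, activation='softmax'))  # one probability per class

model.compile(optimizer=opt,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
SparseCategoricalCrossentropy fits here because your labels are plain integers (0 or 1) rather than one-hot vectors.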