I'm a beginner in this topic. I have this problem: I have to classify each frame of a video into one of 2 classes and get the percentage (probability) of each class. I created a small dataset with about 500 images (250 of each class), and a CNN with these layers:
import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(224, 224, 3)))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(256, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(2, activation='sigmoid'))
model.summary()
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001), metrics=['accuracy'])
1) For this problem, is it better to use binary_crossentropy + sigmoid or binary_crossentropy + softmax? (A sketch of the two output heads I'm choosing between is below these questions.)
2) Is it better to use transfer learning / fine-tuning, or to build a CNN from scratch like this? (See the transfer-learning sketch below.)
3) I'm using ImageDataGenerator for data augmentation because of the small dataset. Is that the right approach? (My setup is sketched below.)
4) Which values should I use for batch_size, steps_per_epoch, learning_rate, ...? I noticed that accuracy and val_accuracy reach 1.0 very early, and the predictions don't return the correct percentage of each class, but values like [9.999e-1 4.444e-5].
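To make question 1) concrete, this is a minimal sketch of the two heads I am comparing, replacing the Dense(2, activation='sigmoid') layer above (as far as I understand these are the usual pairings; only one of the two options would actually be used):

# Option A: one sigmoid unit + binary_crossentropy
# (labels are 0/1; the prediction is a single probability p, the other class is 1 - p)
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001),
              metrics=['accuracy'])

# Option B: two softmax units + categorical_crossentropy
# (labels are one-hot; the prediction is two probabilities that sum to 1)
# model.add(tf.keras.layers.Dense(2, activation='softmax'))
# model.compile(loss='categorical_crossentropy',
#               optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001),
#               metrics=['accuracy'])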
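And this is roughly what I mean by transfer learning in question 2): a frozen pretrained base with a new classification head (MobileNetV2 is just an example base I picked for illustration, not necessarily the right choice):

# Pretrained base, frozen for now (fine-tuning would unfreeze some layers later)
base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                         include_top=False,
                                         weights='imagenet')
base.trainable = False

transfer_model = tf.keras.models.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
transfer_model.compile(loss='binary_crossentropy',
                       optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
                       metrics=['accuracy'])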
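For questions 3) and 4), this is the kind of ImageDataGenerator + fit setup I mean (the directory name, augmentation parameters and batch size here are example values, not my exact configuration):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

batch_size = 16  # example value

train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=20,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   horizontal_flip=True,
                                   validation_split=0.2)

# 'dataset/' is a placeholder path with one subfolder per class
train_gen = train_datagen.flow_from_directory('dataset/',
                                              target_size=(224, 224),
                                              batch_size=batch_size,
                                              class_mode='categorical',  # matches my 2-unit output; 'binary' would pair with a single sigmoid unit
                                              subset='training')
val_gen = train_datagen.flow_from_directory('dataset/',
                                            target_size=(224, 224),
                                            batch_size=batch_size,
                                            class_mode='categorical',
                                            subset='validation')

history = model.fit(train_gen,
                    steps_per_epoch=train_gen.samples // batch_size,
                    validation_data=val_gen,
                    validation_steps=val_gen.samples // batch_size,
                    epochs=30)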
Take the np.argmax of the prediction: in the example you have given, pred = [9.999e-1 4.444e-5], the predicted class is 0, as pred[0] > pred[1].
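As a minimal check of that, using the example values above:

import numpy as np

pred = np.array([9.999e-1, 4.444e-5])  # the example prediction above
print(np.argmax(pred))                  # prints 0, i.e. the first class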