I know that adding a dropout layer to a CNN model can improve accuracy, since it reduces the impact of over-fitting. However, I built CNN models with 16, 32, and 64 filters, a 3x3 kernel, and 2x2 max pooling, and noticed that the model without the dropout layer performed better than the model with a dropout layer in every case.
from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dense, Flatten, Dropout
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix
import keras
import pickle
classifier = Sequential()
# Single convolution/pooling block
classifier.add(Conv2D(16, (3, 3), input_shape=(200, 200, 3)))
classifier.add(Activation('relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Flatten())
# Fully connected head with dropout before the 7-class softmax
classifier.add(Dense(128))
classifier.add(Activation('relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(7))
classifier.add(Activation('softmax'))
classifier.summary()
classifier.compile(optimizer=keras.optimizers.Adam(lr=0.001),
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
batchsize = 10
training_set = train_datagen.flow_from_directory('/home/osboxes/Downloads/Downloads/Journal_Paper/Malware_Families/Spectrogram/Train/',
                                                 target_size=(200, 200),
                                                 batch_size=batchsize,
                                                 class_mode='categorical')
test_set = test_datagen.flow_from_directory('/home/osboxes/Downloads/Downloads/Journal_Paper/Malware_Families/Spectrogram/Validate/',
                                            target_size=(200, 200),
                                            batch_size=batchsize,
                                            shuffle=False,  # keep order so test_set.classes lines up with predictions
                                            class_mode='categorical')
history = classifier.fit_generator(training_set,
                                   steps_per_epoch=2340 // batchsize,
                                   epochs=100,
                                   validation_data=test_set,
                                   validation_steps=781 // batchsize)
classifier.save('16_With_Dropout_rl_001.h5')
# Pickle the history to a separate file; reusing the .h5 filename here
# would overwrite the model that was just saved.
with open('16_With_Dropout_rl_001_history.pkl', 'wb') as file_pi:
    pickle.dump(history.history, file_pi)
test_set.reset()  # rewind the generator so predictions start from the first sample
Y_pred = classifier.predict_generator(test_set, steps=781 // batchsize + 1)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(test_set.classes, y_pred))
print('Classification Report')
# class_indices maps each class name to its index, e.g. coinhive, emotet,
# fareit, gafgyt, mirai, ramnit, razy
class_labels = list(test_set.class_indices.keys())
report = classification_report(test_set.classes, y_pred, target_names=class_labels)
print(report)
# summarize history for accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy 16 with dropout rl .001')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss 16 with dropout rl .001')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
I know that adding a dropout layer to a CNN model can improve accuracy, since it reduces the impact of over-fitting.
You can put it that way, but it doesn't hold in general. A dropout layer is a regularization technique that decreases the flexibility of your model, which can prevent overfitting, assuming that your model is flexible enough to handle the task (actually, assuming that your model is more flexible than needed). If your model is not capable of handling the task to begin with, meaning that it is too weak, then adding any kind of regularization will probably only worsen its performance.
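To see what dropout actually does, here is a minimal sketch (assuming a TF2-era Keras where a layer can be called directly on an array): in training mode, Dropout(0.5) zeroes roughly half of the activations and rescales the survivors by 2; at inference time it passes its input through unchanged.
import numpy as np
from keras.layers import Dropout

# Illustrative only: inverted dropout zeroes a random fraction of the
# inputs in training mode and scales the rest by 1/(1 - rate).
drop = Dropout(0.5)
x = np.ones((1, 10))
print(drop(x, training=True))   # a mix of 0.0 and 2.0 entries
print(drop(x, training=False))  # all entries remain 1.0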
That being said, CNNs usually perform better when you include more than just one convolutional layer. The idea is that deeper convolutional layers learn more complex features, while the layers close to the input learn just basic shapes (of course, this depends on the structure of the network itself and on the complexity of the task). And since you usually want to include more convolutional layers, the complexity (and flexibility) of such a model rises, which can lead to overfitting, hence the need for regularization techniques. (Three convolutional layers with regularization will usually outperform one convolutional layer without regularization.)
Your design only includes one convolutional layer. I would suggest stacking multiple convolutional/pooling layers on top of each other and adding dropout layers to fight overfitting if necessary (on a model as simple as yours, it will probably be hard to see any positive effect from regularization); see the sketch below.
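For example, a deeper variant of your model could look like the following sketch. It is only an illustration: the 16/32/64 filter counts mirror the ones you mention, and the 0.5 dropout rate is an arbitrary starting point to tune.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

deeper = Sequential()
# Three stacked conv/pool blocks: early blocks pick up simple shapes,
# later ones combine them into more complex features.
deeper.add(Conv2D(16, (3, 3), activation='relu', input_shape=(200, 200, 3)))
deeper.add(MaxPooling2D(pool_size=(2, 2)))
deeper.add(Conv2D(32, (3, 3), activation='relu'))
deeper.add(MaxPooling2D(pool_size=(2, 2)))
deeper.add(Conv2D(64, (3, 3), activation='relu'))
deeper.add(MaxPooling2D(pool_size=(2, 2)))
deeper.add(Flatten())
deeper.add(Dense(128, activation='relu'))
deeper.add(Dropout(0.5))  # regularize the now much more flexible model
deeper.add(Dense(7, activation='softmax'))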