I am working on my own dataset, which is stored in a CSV file. It has three columns: val1 | val2 | label. There are a total of 6 labels. The number of rows and columns is 2000 and 3, respectively. I want to create a 1D CNN that takes val1 and val2 as input and predicts the label. So far I have tried
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical

df = pd.read_csv("data.csv")
x = df.drop(["label"], axis=1) #x.shape = (2000, 2)
x = np.expand_dims(x,-1) #x.shape = (2000, 2, 1)
y = df.label #y.shape = (2000,)
y = to_categorical(y) #y.shape = (2000, 6)
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.2)
model = Sequential()
model.add(Conv1D(filters=256, kernel_size=2, activation='relu', input_shape=(2,1)))
model.add(Dropout(0.2))
model.add(MaxPooling1D(pool_size=1))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(6, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train,
batch_size=64,
epochs=100,
verbose=1,
validation_data=(X_valid,y_valid),
shuffle=True,
)
The above model reaches a maximum of only about 30% training and validation accuracy.
Things that I have tried: data augmentation, changing the number of filters, and increasing the number of layers.
How can I increase the accuracy of the model?
There are plenty of options you can try; what follows is not an exhaustive list. The first step should definitely be to check your data. Since both training and validation accuracy are low (with six classes, random guessing already scores about 17%), the symptom looks like underfitting, which would normally suggest that your model is too small or too heavily regularized (Dropout). In your case, though, I suspect the opposite problem: the model is too large and too complex for the task. A Conv1D over a length-2 input has essentially no local structure to exploit, so the convolutional architecture buys you nothing here. Give logistic regression, an SVM, or a fully connected network (FCNN) a shot. If it turns out that your task is indeed very complex, try to gather more data or exploit more structure in your problem.
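As a concrete starting point, here is a minimal sketch of those classical baselines. It assumes the same data.csv layout as in your question; the scaling step, stratified split, and model settings are illustrative choices, not a prescription:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

df = pd.read_csv("data.csv")
print(df.label.value_counts())  # check the class balance first

X = df[["val1", "val2"]].values
y = df.label.values

# Stratified split keeps the 6 classes proportionally represented
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)

# Scale the two features; both models are sensitive to feature magnitude
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

for clf in (LogisticRegression(max_iter=1000), SVC(kernel="rbf")):
    clf.fit(X_train, y_train)
    print(type(clf).__name__, clf.score(X_test, y_test))

If either baseline already beats your 30%, that points to the architecture rather than the data as the bottleneck.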
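And if you would rather stay in Keras, a small FCNN on the raw (2000, 2) matrix is a better match for two scalar inputs than a Conv1D. A sketch, reusing your preprocessing but dropping the expand_dims step; the hidden-layer size of 32 is an arbitrary starting point:

import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

df = pd.read_csv("data.csv")
x = df[["val1", "val2"]].values  # keep the (2000, 2) shape; no np.expand_dims
y = to_categorical(df.label)     # one-hot targets, as in your code

X_train, X_valid, y_train, y_valid = train_test_split(x, y, test_size=0.2)

model = Sequential([
    Dense(32, activation='relu', input_shape=(2,)),  # small hidden layers for 2 inputs
    Dense(32, activation='relu'),
    Dense(6, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=64, epochs=100,
          validation_data=(X_valid, y_valid))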