Tags: python, tensorflow, keras, neural-network, dropout

How to improve a neural network with dropout layers?


I am working on a neural network that predicts heart disease. The data comes from Kaggle and has been pre-processed. I have used various models, such as logistic regression, random forests, and SVM, all of which produce solid results. I'm now trying the same data with a neural network to see whether a NN can outperform the other ML models (the data set is rather small, which may explain poor results). Below is my code for the network. The model produces 50% accuracy, which is obviously too low to be useful. From what you can tell, does anything look off that would undermine the accuracy of the model?

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.layers import Dense, Dropout
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping

df = pd.read_csv(r"C:\Users\***\Desktop\heart.csv")

X = df[['age','sex','cp','trestbps','chol','fbs','restecg','thalach']].values
y = df['target'].values

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

scaler.fit_transform(X_train)
scaler.transform(X_test)


nn = tf.keras.Sequential()

nn.add(Dense(30, activation='relu'))

nn.add(Dropout(0.2))

nn.add(Dense(15, activation='relu'))

nn.add(Dropout(0.2))


nn.add(Dense(1, activation='sigmoid'))


nn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])


early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=25)

nn.fit(X_train, y_train, epochs=1000, validation_data=(X_test, y_test),
       callbacks=[early_stop])

model_loss = pd.DataFrame(nn.history.history)
model_loss.plot()

predictions = nn.predict_classes(X_test)

from sklearn.metrics import classification_report,confusion_matrix

print(classification_report(y_test,predictions))
print(confusion_matrix(y_test,predictions))

Solution

  • The scaler does not work in place: fit_transform and transform return the scaled arrays rather than modifying X_train and X_test, so you need to assign the results back.

    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)
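
    Note that the scaler is fitted on the training data only and merely applied to the test data; fitting on the full data set would leak test-set statistics into the preprocessing.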
    

    You'll then get results more in line with what you were expecting.

                  precision    recall  f1-score   support
    
               0       0.93      0.98      0.95       144
               1       0.98      0.93      0.96       164
    
        accuracy                           0.95       308
       macro avg       0.95      0.96      0.95       308
    weighted avg       0.96      0.95      0.95       308
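
  • A side note, unrelated to the accuracy problem: predict_classes was
    deprecated and then removed in TensorFlow 2.6, so the last block of your
    script will fail on newer versions. A minimal replacement, assuming the
    single-unit sigmoid output above, is to threshold the predicted
    probabilities yourself:

    # predict() returns probabilities of shape (n, 1); threshold at 0.5
    predictions = (nn.predict(X_test) > 0.5).astype("int32")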