
How to view the rows marked as False Positive and False Negative from confusion matrix


I have created a very simple artificial neural network in Python. In the example below, I compute the accuracy from the values in a confusion matrix. These are the results of the confusion matrix:

array([[3990,    2],
       [  56,  172]])

I am interested in finding the rows that were marked as false positives (2) and false negatives (56).

The following is my code:

import numpy as np  # needed for np.delete below

#Import the dataset
X = DBF2.iloc[:, 1:2].values
y = DBF2.iloc[:, 2].values

#Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 0] = labelencoder_X_1.fit_transform(X[:, 0])

#Create dummy variables
#Note: the categorical_features argument was removed in scikit-learn 0.22
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()
#Remove one dummy column to avoid falling into the dummy variable trap
X = np.delete(X, [0], axis=1)

#Splitting the dataset
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2,
                                                    random_state = 0)

#Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

#Make the ANN
import keras
from keras.models import Sequential
from keras.layers import Dense

#Initialising the ANN
classifier = Sequential()

#Adding the input layer and the first hidden layer
classifier.add(Dense(units=200, kernel_initializer='uniform', activation='relu', input_dim=400))

#Adding a second hidden layer
classifier.add(Dense(units=200, kernel_initializer='uniform', activation='relu'))

#Adding the output layer
classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))

#Compiling the ANN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

#Fitting the ANN to the training set
classifier.fit(X_train, y_train, batch_size=10, epochs=20)

#Predicting the Test set results
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)

#Making the confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

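#Find accuracy of test set
#sklearn puts true labels in rows and predicted labels in columns,
#so for 0/1 labels the layout is [[TN, FP], [FN, TP]]
TrueNeg = cm[0,0]
FalsePos = cm[0,1]
TruePos = cm[1,1]
FalseNeg = cm[1,0]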

accuracy = float(TruePos + TrueNeg) / float(TruePos + FalsePos + TrueNeg + FalseNeg)
accuracy = accuracy*100
print("Test Accuracy: {}".format(accuracy))

Solution

  • To do that, you can use boolean masks built from y_pred and y_test:

    # False negatives: true label 1, predicted 0
    X_test[(y_test == 1) & (y_pred[:,0].T == 0)]
    # False positives: true label 0, predicted 1
    X_test[(y_test == 0) & (y_pred[:,0].T == 1)]
    

    Or if you don't care about separating FN from FP:

    X_test[(y_test != y_pred[:,0].T).T]
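
Note that X_test in the question has already been one-hot encoded and scaled, so the masks above return the transformed rows rather than the original rows of DBF2. One way to get back to the original rows is to carry the row positions through the split and index DBF2 with them. The following is a minimal sketch of that idea on made-up data: the column names, the values, and the idx_test array are all assumptions for illustration, not part of the original code. With scikit-learn, idx_test could be obtained by passing np.arange(len(DBF2)) to train_test_split as a third array, since it splits every array it is given in the same way.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the DataFrame in the question (all values are made up).
DBF2 = pd.DataFrame({
    "id": [10, 11, 12, 13, 14, 15],
    "feature": ["a", "b", "a", "c", "b", "c"],
    "label": [0, 1, 1, 0, 1, 0],
})

# Hypothetical positions of the test rows inside DBF2. In the real pipeline
# these would come out of the train/test split alongside X_test and y_test.
idx_test = np.array([4, 0, 2, 5, 1])

# True labels of the test rows, in the same order as idx_test.
y_test = DBF2["label"].to_numpy()[idx_test]

# Keras predict() returns an array of shape (n, 1); thresholding keeps that
# shape, so the toy predictions are built the same way here.
y_pred = np.array([[0.9], [0.7], [0.2], [0.1], [0.4]]) > 0.5

# Flatten to shape (n,) so it lines up with y_test element by element.
pred = y_pred.ravel().astype(int)

fn_mask = (y_test == 1) & (pred == 0)   # true 1, predicted 0
fp_mask = (y_test == 0) & (pred == 1)   # true 0, predicted 1

# Map the masked test positions back to rows of the original DataFrame.
false_negatives = DBF2.iloc[idx_test[fn_mask]]
false_positives = DBF2.iloc[idx_test[fp_mask]]

print(false_negatives)
print(false_positives)
```

Because the lookup goes through positions rather than through X_test, the returned rows keep every original column, including ones that were never fed to the network.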