How to create a confusion matrix with pandas.crosstab when all the predicted values are 1?

I am learning about performance metrics. I have a dataframe with 10,100 rows (index 0 to 10099) and two columns (Y_Actual, Y_Predicted). I would like to create a confusion matrix with pandas.

My first attempt:

y_actual = df5a["y"].rename("Actual")
y_predicted = df5a["labels"].rename("Predicted")
confusion_matrix_5a = pd.crosstab(y_actual, y_predicted)
confusion_matrix_5a

output1:

Predicted   1
Actual  
0.0        100
1.0        10000

After checking Y_Predicted, I realized that every value was 1. To get pandas.crosstab() to produce a 2x2 matrix in this situation, I added an extra row to my dataframe (Y_Actual=0, Y_Predicted=0).

output2:

Predicted   0   1
Actual      
0.0         1   100
1.0         0   10000

The real confusion matrix should be:

Predicted   0   1
Actual      
0.0         0   100
1.0         0   10000

The "1" in output2 is there only because I added the extra row. I know this will barely affect my accuracy because I have many rows, but the count itself is still wrong. Is there another way to create the matrix with pandas.crosstab() when one of the columns contains only a single unique value, without adding the extra row?

Solution

crosstab only creates rows and columns for values that actually occur in the data, so you need to add the missing column yourself. A simple way to do that is reindex.

Let's say conf_mat is your confusion matrix with only one column.

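Then you can do conf_mat = conf_mat.reindex([0, 1], axis='columns', fill_value=0) to force the dataframe to hold columns named 0 and 1, with the missing column filled with 0. Note that reindex returns a new dataframe rather than modifying conf_mat in place, so assign the result.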
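For a runnable illustration, here is a minimal sketch of the reindex approach. The dataframe df is a hypothetical stand-in for df5a, built to match the counts in the question (100 actual zeros, 10000 actual ones, every prediction equal to 1). The column names "y" and "labels" are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for df5a: 100 actual negatives, 10000 actual
# positives, and every predicted label equal to 1.
df = pd.DataFrame({
    "y": [0.0] * 100 + [1.0] * 10000,
    "labels": [1] * 10100,
})

y_actual = df["y"].rename("Actual")
y_predicted = df["labels"].rename("Predicted")

# crosstab only produces a column for Predicted == 1 here,
# because 0 never occurs among the predictions.
conf_mat = pd.crosstab(y_actual, y_predicted)

# reindex returns a new DataFrame; the missing column 0 is added
# and filled with zeros.
conf_mat = conf_mat.reindex([0, 1], axis="columns", fill_value=0)
print(conf_mat)
```

If the actual values could also be missing a class, the same call with axis="index" should fill in the missing row in the same way.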

Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reindex.html