Here are my results from the confusionMatrix() function in R, based on a ZeroR model. I may have set up the function incorrectly: the sensitivity I calculated by hand (which varies with the random seed) doesn't match the confusionMatrix() result, which reports a sensitivity of exactly 1.0000:
> sensitivity1 = 213/(213+128)
> sensitivity2 = 211/(211+130)
> sensitivity3 = 215/(215+126)
> #specificity = 0/(0+0) there were no other predictions
> specificity = 0
> specificity
[1] 0
> sensitivity1
[1] 0.6246334
> sensitivity2
[1] 0.6187683
> sensitivity3
[1] 0.6304985
There is a warning message, but the function still runs and refactors the data to match because the levels weren't in the same order; this may be due to how the train and test sets are ordered and randomized. I went back and checked that the train and test sets didn't have reversed ordering from the negative-sign indexing, or different numbers of rows. Here are the results from caret's confusionMatrix() function:
> confusionMatrix(as.factor(testDiagnosisPred), as.factor(testDiagnosis), positive="B")
Confusion Matrix and Statistics

          Reference
Prediction   B   M
         B 211 130
         M   0   0

Accuracy : 0.6188
95% CI : (0.5649, 0.6706)
No Information Rate : 0.6188
P-Value [Acc > NIR] : 0.524
Kappa : 0
Mcnemar's Test P-Value : <2e-16
Sensitivity : 1.0000
Specificity : 0.0000
Pos Pred Value : 0.6188
Neg Pred Value : NaN
Prevalence : 0.6188
Detection Rate : 0.6188
Detection Prevalence : 1.0000
Balanced Accuracy : 0.5000
'Positive' Class : B
Warning message:
In confusionMatrix.default(as.factor(testDiagnosisPred), as.factor(testDiagnosis), :
Levels are not in the same order for reference and data. Refactoring data to match.
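One plausible cause of this warning, assuming the prediction vector only ever contains "B", is that as.factor() gives it a single level while the reference has two. A minimal base R sketch follows; the vectors are hypothetical stand-ins rebuilt from the counts above, not the real data:

```r
# Hypothetical stand-ins for the vectors in the post (same counts, not the real data).
testDiagnosis     <- c(rep("B", 211), rep("M", 130))
testDiagnosisPred <- rep("B", length(testDiagnosis))

# as.factor() only creates levels for values that actually occur,
# so the all-"B" prediction vector ends up with one level.
levels(as.factor(testDiagnosisPred))  # only "B"
levels(as.factor(testDiagnosis))      # "B" and "M"

# Declaring the same levels on both sides keeps the two factors aligned.
class_levels <- c("B", "M")
pred <- factor(testDiagnosisPred, levels = class_levels)
ref  <- factor(testDiagnosis,     levels = class_levels)

# Cross-tabulate: rows are predictions, columns are the reference classes.
cm <- table(Prediction = pred, Reference = ref)
print(cm)
```

With both factors sharing the same levels, passing pred and ref to confusionMatrix(pred, ref, positive = "B") should no longer need to refactor anything, though that call is not run here.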
testDiagnosisPred just shows that the model guesses Benign (B) as the diagnosis for every case in the test set. The counts vary by seed because the actual Benign (B) and Malignant (M) cases are randomized on each split.
> ## testDiagnosisPred
> ## B
> ## 228
> majorityClass # confusion matrix
211 130
> ##
> ## B M
> ## 213 128
> # another seed's confusion matrix
> ## B M
> ## 211 130
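As a rough illustration of how an all-"B" prediction vector like this comes about, here is a minimal ZeroR sketch in base R. The function name zero_r and the toy labels are hypothetical, made up for this example rather than taken from the code behind the post:

```r
# Hypothetical helper: ZeroR ignores all attributes and predicts the
# most frequent class in the training labels for every new case.
zero_r <- function(train_labels, n_new) {
  counts   <- table(train_labels)
  majority <- names(counts)[which.max(counts)]
  rep(majority, n_new)
}

# Toy training labels (not the real data): "B" is the majority class.
train_labels <- c("B", "B", "M", "B", "M")

# Every one of the four new cases gets the majority label.
pred <- zero_r(train_labels, n_new = 4)
print(pred)
```

Because the prediction depends only on the class counts in the training split, a different seed changes the counts in the table but not the predicted label, as long as "B" stays the majority.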
Here's what some of the data looks like using the head() and str() functions:
> head(testDiagnosisPred)
[1] "B" "B" "B" "B" "B" "B"
> head(cancerdata.train$Diagnosis)
[1] "B" "B" "M" "M" "M" "B"
> head(testDiagnosis)
[1] "B" "B" "M" "M" "M" "B"
> str(testDiagnosisPred)
chr [1:341] "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" ...
> str(cancerdata.train$Diagnosis)
chr [1:341] "B" "B" "M" "M" "M" "B" "B" "B" "M" "M" "M" "B" "M" "M" "B" "M" "B" "B" "B" "M" "B" "B" "B" "B" ...
> str(testDiagnosis)
chr [1:341] "B" "B" "M" "M" "M" "B" "B" "B" "M" "M" "M" "B" "M" "M" "B" "M" "B" "B" "B" "M" "B" "B" "B" "B" ...
The confusion about the confusion matrix and the sensitivity and specificity calculations came from reading the matrix horizontally (across rows) instead of vertically (down columns). The correct answer is the one from caret's confusionMatrix() function. Another way to see this: it's a ZeroR model, and when the majority class is also the positive class, as it is here, a ZeroR model always has a sensitivity of 1.00 and a specificity of 0.00. That's because ZeroR uses zero rules and zero attributes; it just predicts the majority class.
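The column-wise reading can be checked by hand. The sketch below rebuilds the matrix from the counts in the confusionMatrix() output above; the variable names are just for illustration:

```r
# Counts copied from the confusionMatrix() output in the post.
# matrix() fills column by column: rows are predictions, columns are reference.
cm <- matrix(c(211, 0, 130, 0), nrow = 2,
             dimnames = list(Prediction = c("B", "M"),
                             Reference  = c("B", "M")))

positive <- "B"
negative <- "M"

# Sensitivity: among cases that are actually positive (column "B"),
# the share that were predicted positive.
sensitivity <- cm[positive, positive] / sum(cm[, positive])  # 211 / (211 + 0)

# Specificity: among cases that are actually negative (column "M"),
# the share that were predicted negative.
specificity <- cm[negative, negative] / sum(cm[, negative])  # 0 / (130 + 0)

# Reading across row "B" instead gives the positive predictive value,
# which is the 211 / (211 + 130) figure from the manual calculation.
ppv <- cm[positive, positive] / sum(cm[positive, ])

print(c(sensitivity = sensitivity, specificity = specificity, ppv = ppv))
```

The row-wise figure is not wrong as a number; it is just the quantity confusionMatrix() reports as Pos Pred Value (0.6188) rather than the sensitivity.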
