machine-learning roc auc

Why don't the false positive rate and true positive rate add up to one in a roc-auc curve?


The false positive rate is the x-axis, spanning from 0 to 1. The true positive rate is the y-axis, also spanning from 0 to 1. Yet the graphs show data points like (.8, .8): if the TPR is .8 and the FPR is .8, they add up to 1.6...


Solution

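  • Typically the axes are normalised by totals from the test/validation set: false positives by the number of actual negatives, true positives by the number of actual positives. Otherwise the curve wouldn't end at (1, 1). Because the two rates have different denominators, nothing forces them to add up to 1, so a point like (.8, .8) is perfectly valid. I personally prefer to label the axes by the number of instances.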

    Why not normalise by the total number? In real applications it gets rather complicated, as you often do not have labels for all examples. The typical example for ROC curves is mass mailing: to normalise the curve correctly you would need to spam the entire world.
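
To make the point about the two different denominators concrete, here is a minimal sketch in plain Python. The helper `roc_point`, the toy labels and scores, and the 0.5 threshold are all made up for illustration; they are not taken from the question. The sketch assumes the usual definitions FPR = FP / (FP + TN) and TPR = TP / (TP + FN), and reproduces the (.8, .8) point from the question.

```python
def roc_point(labels, scores, threshold):
    """Return (fpr, tpr) when every score >= threshold is predicted positive.

    Hypothetical helper for illustration; labels are 1 (positive) or 0 (negative).
    """
    tp = fp = fn = tn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    fpr = fp / (fp + tn)  # normalised by the number of actual negatives
    tpr = tp / (tp + fn)  # normalised by the number of actual positives
    return fpr, tpr


# Toy data (assumed): 5 actual positives followed by 5 actual negatives.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.85, 0.75, 0.65, 0.55, 0.1]

fpr, tpr = roc_point(labels, scores, threshold=0.5)
print(fpr, tpr)   # 0.8 0.8
print(fpr + tpr)  # 1.6
```

The pairs that do sum to 1 are the ones sharing a denominator: TPR with the false negative rate (both over actual positives) and FPR with the true negative rate (both over actual negatives).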