Tags: information-retrieval, roc, precision-recall

zero denominator in ROC and Precision-Recall?


This question is about the ROC curve, but it can be generalized to the Precision-Recall curve.

As you may know, the ROC curve is drawn using the False Positive Rate (FPR) and the True Positive Rate (TPR), where:

TPR = tp / (tp + fn)   // tp = true positives, fn = false negatives
FPR = fp / (fp + tn)   // fp = false positives, tn = true negatives

But what if one of the denominators is 0? The optimal value of TPR is 1, while FPR is 0 (in fact the optimal point in the ROC space is (0,1)).
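For concreteness, here is a minimal sketch (with made-up counts) of how the naive formulas break when a configuration produces no false positives and no true negatives:

    tp, fn = 9, 1          # made-up counts: the classifier finds most positives
    fp, tn = 0, 0          # this configuration never returns fp or tn

    tpr = tp / (tp + fn)   # 0.9 -- fine
    fpr = fp / (fp + tn)   # raises ZeroDivisionError: the denominator is 0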

This is particularly important if we use the ROC curve to compute the optimal threshold in a classification system.

For example, in my case, for a particular configuration my system never returns fp or tn, so the denominator of FPR is always 0.

Update for clarification:

I'm using the true/false positive/negative counts (TP/FP/TN/FN) and the ROC curve to decide a threshold value for my classifier. In particular, I compute these counts for a given cut-off on the top-k most similar elements in the dataset w.r.t. the given query. So if we consider only the top-1 elements, TP/FP/TN/FN are computed only on very similar objects, and it is very likely that the classifier never returns negatives. As a result, the threshold is very strict, but the classifier is very precise. Something like: "I don't know what to answer most of the time, but when I do answer, I'm correct almost 100% of the time."

Of course, if we increase k, negatives appear and the threshold increases. As a result, the classifier answers more often, but the probability of wrong results is higher.

So I think I will keep k as a tuning parameter, depending on the considered application: if we want a very precise classifier we set a small k; otherwise, if we can tolerate false positives, we can choose a larger k. A rough sketch of this tuning idea is shown below.
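The sketch below is only an illustration of that idea; the helpers `distance(q, x)` and `is_relevant(q, x)`, as well as the candidate-threshold loop, are assumptions of mine, not part of any library or of the actual system. For each query it keeps the top-k closest elements, labels each retrieved element, and sweeps candidate thresholds to build the (FPR, TPR) points from which an operating threshold can then be chosen:

    def roc_points(queries, dataset, k, thresholds, distance, is_relevant):
        """For each candidate threshold, count tp/fp/tn/fn over the top-k
        retrieved elements of every query and return (FPR, TPR) pairs."""
        points = []
        for t in thresholds:
            tp = fp = tn = fn = 0
            for q in queries:
                # hypothetical helpers: distance(q, x) and is_relevant(q, x)
                topk = sorted(dataset, key=lambda x: distance(q, x))[:k]
                for x in topk:
                    predicted_positive = distance(q, x) <= t
                    if predicted_positive and is_relevant(q, x):
                        tp += 1
                    elif predicted_positive:
                        fp += 1
                    elif is_relevant(q, x):
                        fn += 1
                    else:
                        tn += 1
            # guard against empty denominators, as discussed in this question
            tpr = tp / (tp + fn) if (tp + fn) else 0.0
            fpr = fp / (fp + tn) if (fp + tn) else 0.0
            points.append((fpr, tpr))
        return points

A small k keeps only very close matches per query, so negatives (and false positives) are rare and the chosen threshold ends up strict; a larger k admits more distant matches and loosens it.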

My application:

My app is a similarity cache for images: when a query is received, the system checks whether there is a cached image that is "similar enough". If so, it returns the cached result; otherwise, it queries the back-end system. "Similar enough" is defined by the threshold. To choose a good threshold, we select a subset of the dataset images, the so-called "queries" in this question. As a first approach, as I explained above, for each query I select the top-1 element, i.e. the most similar image w.r.t. the query (one of the setup images) in the whole dataset. From there, I compute the threshold using the ROC curve as explained above. So, if we use n queries, we obtain n predictions.

With this approach the resulting threshold is very strict: since we only consider the top-1 element, the average distance is very small (and very precise), and so we obtain a strict threshold.

If we use the top-k approach (say k=10), we select the top-k most similar images and apply the same procedure as above. The threshold becomes larger and we get more cache hits, but the probability of false positives is also higher. In this case we obtain k*n predictions. If we set k to the whole dataset of size m, we obtain m*n predictions. A sketch of the cache-lookup side follows.
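For illustration only, here is a minimal sketch of how such a similarity cache might use the chosen threshold at lookup time; the `distance` function, the `query_backend` callback, and the dict-shaped cache are assumptions of mine, not the actual implementation:

    def lookup(query, cache, threshold, distance, query_backend):
        """Return the cached result of the most similar stored image if it is
        'similar enough', otherwise fall back to the back-end system."""
        best = min(cache, key=lambda img: distance(query, img), default=None)
        if best is not None and distance(query, best) <= threshold:
            return cache[best]            # cache hit: reuse the stored result
        result = query_backend(query)     # cache miss: ask the back end
        cache[query] = result             # remember the answer for next time
        return result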

I hope this clarifies my previous update.


Solution

  • You should just check whether the numerator equals 0 before computing your ratio: the denominator fp + tn can only be 0 when fp is also 0, so this check covers the 0/0 case (for which returning 0 is the natural convention). For example

    def fpr(fp, tn):
      # fp == 0 also covers the 0/0 case (fp == tn == 0); when tn > 0,
      # the ratio would be 0 anyway, so returning 0.0 is consistent
      if fp == 0:
        return 0.0
      return fp / (fp + tn)
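    For instance, with the situation from the question (no false positives and no true negatives), the guarded version reports an FPR of 0 instead of raising a division-by-zero error:

      print(fpr(0, 0))   # 0.0 instead of a ZeroDivisionError
      print(fpr(3, 7))   # 0.3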