Tags: python, tensorflow, machine-learning, precision-recall

Reducing false positives in ML models


Is there a nice way to enforce a limit on the false positives while training an ML model?

Let's suppose you start with a balanced dataset with two classes and develop an ML model for binary classification. Since the task is easy, the output distributions will peak at 0 and 1 respectively, overlapping around 0.5. However, what you really care about is that the false positive rate stays sustainable and does not exceed a certain limit. So ideally, for pred > 0.8, you would like to see only one class.
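For concreteness, this is roughly the quantity I mean: the false positive rate restricted to the confident region pred > 0.8, measured on a validation set. The names y_val, y_pred and the helper function below are just for illustration, not part of any library:

import numpy as np

def fpr_above_threshold(y_true, y_pred, threshold=0.8):
    # False positives: class-0 samples whose predicted score exceeds the threshold.
    false_pos = np.sum((y_true == 0) & (y_pred > threshold))
    negatives = np.sum(y_true == 0)
    return false_pos / negatives if negatives > 0 else 0.0

# y_pred = model.predict(x_val).ravel()
# print(fpr_above_threshold(y_val, y_pred, threshold=0.8))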

At the moment I'm weighting the two classes to penalise errors on class "0":

history = model.fit(..., class_weight={0:5, 1:1}, ...)

As expected, it does decrease the FPR in the region pred > 0.8, but of course it worsens the recall of class 1.
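For reference, a minimal self-contained version of this class-weighting setup could look as follows; the toy data, layer sizes and epoch count are placeholders, not taken from my actual model:

import numpy as np
import tensorflow as tf

x_train = np.random.rand(1000, 20).astype("float32")   # placeholder features
y_train = np.random.randint(0, 2, size=1000)            # placeholder binary labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.FalsePositives()])

# Errors on class 0 count five times as much, pushing the model to avoid false positives.
history = model.fit(x_train, y_train, epochs=5, batch_size=32,
                    class_weight={0: 5.0, 1: 1.0}, verbose=0)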

I'm wondering if there are other ways to enforce this.

Thank you


Solution

  • Depending on your problem, you can consider a one-class classification SVM. This article may be useful: https://towardsdatascience.com/outlier-detection-with-one-class-svms-5403a1a1878c . It also shows why one-class classification can be preferable to some other classical techniques, such as oversampling/undersampling or class weighting. But of course it depends on the problem you want to solve. A rough sketch of this approach follows below.
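As a sketch of what the one-class route might look like with scikit-learn's OneClassSVM: you fit it only on samples of the class you want to accept, and treat everything it rejects as the other class. The data and the nu value below are placeholders:

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
normal_train = rng.normal(size=(500, 5))            # placeholder: samples of the "accepted" class
test_points = rng.normal(scale=1.5, size=(50, 5))   # placeholder: unseen data

# nu roughly bounds the fraction of training samples treated as outliers,
# which gives indirect control over how strict the acceptance region is.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
clf.fit(normal_train)

pred = clf.predict(test_points)                     # +1 = accepted, -1 = rejected
print("accepted:", np.sum(pred == 1), "rejected:", np.sum(pred == -1))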