machine-learning artificial-intelligence metrics evaluation

How to evaluate unsupervised anomaly detection


I am trying to solve a regression problem: predicting a continuous value using machine learning. My dataset consists of 6 float columns.

The data come from low-cost sensors, so they very likely contain values that are out of the ordinary. To deal with this, before predicting my continuous target I want to detect anomalies in the data and use the detector as a data filter. However, the data are not labeled, which means I have an unsupervised anomaly detection problem.

The algorithms used for this task are Local Outlier Factor, One Class SVM, Isolation Forest, Elliptic Envelope and DBSCAN.

After fitting these algorithms, I need to evaluate them in order to choose the best one. Does anyone have an idea how to evaluate an unsupervised anomaly detection algorithm?


Solution

  • The only way is to generate synthetic anomalies, that is, to introduce outliers yourself, using your knowledge of what a typical outlier looks like. You can then check how well each algorithm recovers the outliers you injected.
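
A minimal sketch of that idea with scikit-learn, under several assumptions that are not from the question: the random `X_clean` array stands in for the real 6-column sensor data, the hypothetical helper `inject_spikes` models an outlier as a large spike in a single column, and the 5% fraction and 8-standard-deviation spike size are arbitrary. Replace them with whatever a typical sensor fault looks like in your data. Because the injected rows are known, ordinary ranking metrics such as ROC AUC and average precision can be computed for each detector.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Stand-in for the real sensor data: 1000 rows, 6 float columns (assumption).
X_clean = rng.normal(loc=0.0, scale=1.0, size=(1000, 6))


def inject_spikes(X, fraction, scale, rng):
    """Hypothetical helper: add a large spike to one random column
    in a random subset of rows, and return the data plus 0/1 labels."""
    X_out = X.copy()
    n_anom = int(round(fraction * len(X)))
    rows = rng.choice(len(X), size=n_anom, replace=False)
    cols = rng.integers(0, X.shape[1], size=n_anom)
    signs = rng.choice([-1.0, 1.0], size=n_anom)
    X_out[rows, cols] += signs * scale * X.std(axis=0)[cols]
    y = np.zeros(len(X), dtype=int)
    y[rows] = 1  # 1 marks an injected anomaly
    return X_out, y


X, y_true = inject_spikes(X_clean, fraction=0.05, scale=8.0, rng=rng)

detectors = {
    "IsolationForest": IsolationForest(random_state=0),
    "LocalOutlierFactor": LocalOutlierFactor(n_neighbors=20),
    "OneClassSVM": OneClassSVM(nu=0.05, gamma="scale"),
    "EllipticEnvelope": EllipticEnvelope(contamination=0.05, random_state=0),
}

results = {}
for name, det in detectors.items():
    det.fit(X)
    if isinstance(det, LocalOutlierFactor):
        # In its default (non-novelty) mode LOF only exposes scores
        # for the training data through this attribute.
        scores = -det.negative_outlier_factor_
    else:
        # decision_function is higher for inliers, so negate it
        # to get a score that is higher for anomalies.
        scores = -det.decision_function(X)
    results[name] = {
        "roc_auc": roc_auc_score(y_true, scores),
        "average_precision": average_precision_score(y_true, scores),
    }

for name, metrics in results.items():
    print(name, metrics)
```

DBSCAN is left out of the sketch because it produces only cluster labels (noise points are labeled -1) rather than a continuous score, so it would be compared with precision and recall on the injected rows instead. Keep in mind that the ranking of detectors is only as meaningful as the synthetic outliers are realistic.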