Search code examples
machine-learningtime-seriesanomaly-detection

Isolation Forest for time series data


I just wonder if the isolation Forest (iForest) can work with time-series data. As far as I know, iForest is used for anomaly detection and it is based on randomization techniques to randomly and recursively partition the data and then save the partition in a tree structure.

I have a theoretical question. I just wonder if the iForest can work with the time series data since it is based on some randomization techniques. Would this violate the time series characteristics as the randomization may break the time dependencies?.


Solution

  • Isolation forest will help with detecting point anomalies by default, since in principle it is just working on the rarity of these observations.

    But let’s say I am interested in anomalies in time series data. Isolation forest will be able to pick out the extreme Peaks and troughs that occur as point anomalies here but for collective anomalies, you may need to transform the data such that each observation represents a collection of observations (rolling window operations) etc.

    The reason is that in time series data you are interested in additive outliers or temporal changes and thus your observations must represent that individually if you plan to use Isolation forest. But you can try other techniques such as STL decomposition, Arima, regression trees, exponential smoothing. You should find a lot of material on how to use the above for anomaly detection in time series.