python c# anomaly-detection lower-bound upperbound

Interval Prediction for a Time Series | Anomaly in Time Series


I have a time series in which I am trying to detect anomalies. Along with the anomalies, I want a range within which each data point must lie to be considered normal. I am using ML.NET to detect the anomalies, and that part works, but how do I get the range?

If I can somehow get the range for each point in the time series, I can plot it and show that the points outside this range are anomalies.

I have tried to calculate the range using a prediction interval, but that doesn't work for all the data points in the time series.

For example, assume I have 100 points and take 100/4 = 25 as the sliding window to calculate the prediction interval for the next point, i.e. the 26th point. The problem is then how to calculate the prediction interval for the first 25 points.


Solution

  • A method operating on a fixed-length sliding window generally needs the entire window to be filled before it can produce an output. In that case you must pad the beginning of the input sequence if you want predictions (and thus anomaly scores) for the first data points. It can be hard to make the padded data realistic, however, which can lead to poor predictions.

    A nifty technique is to compute anomaly scores with two different models, one running in the forward direction and the other in the reverse direction, so that you get scores everywhere. However, you must then decide how to handle the areas where you have two sets of predictions: whether to use the min, max, or average anomaly score.

    Some models can operate well on variable-length inputs, such as sequence-to-sequence models built with recurrent neural networks.
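To make the forward/reverse idea concrete, here is a minimal Python sketch. It is not the ML.NET detector from the question. As a stand-in "model" it assumes a simple normal-approximation prediction interval (window mean ± z · standard deviation · sqrt(1 + 1/window)), and the function names `rolling_interval` and `two_way_interval` are hypothetical. Where both directions give a band, this sketch keeps the wider one, which is just one of the min/max/average choices mentioned above.

```python
from statistics import mean, stdev


def rolling_interval(x, window, z=1.96):
    """Bounds for x[i] from the `window` points before it.

    Assumption: points in a window are roughly i.i.d. normal, so the
    band is mean +/- z * stdev * sqrt(1 + 1/window).
    Positions without a full window behind them stay None.
    """
    n = len(x)
    lower, upper = [None] * n, [None] * n
    for i in range(window, n):
        past = x[i - window:i]
        half = z * stdev(past) * (1 + 1 / window) ** 0.5
        m = mean(past)
        lower[i], upper[i] = m - half, m + half
    return lower, upper


def _pick(combine, a, b):
    """Combine two bounds, falling back to whichever one exists."""
    if a is None:
        return b
    if b is None:
        return a
    return combine(a, b)


def two_way_interval(x, window, z=1.96):
    """Forward pass plus a pass over the reversed series.

    The reversed pass covers the first `window` points, which the
    forward pass cannot. Needs len(x) >= 2 * window for full coverage.
    """
    f_lo, f_hi = rolling_interval(x, window, z)
    b_lo, b_hi = rolling_interval(x[::-1], window, z)
    b_lo, b_hi = b_lo[::-1], b_hi[::-1]
    # Design choice (an assumption): keep the wider band where both exist.
    lower = [_pick(min, a, b) for a, b in zip(f_lo, b_lo)]
    upper = [_pick(max, a, b) for a, b in zip(f_hi, b_hi)]
    return lower, upper


# Usage with synthetic data: 100 points, window of 25, one injected spike.
import random

random.seed(0)
series = [random.gauss(0, 1) for _ in range(100)]
series[5] += 50  # spike inside the first 25 points
lower, upper = two_way_interval(series, window=25)
anomalies = [i for i, v in enumerate(series) if not lower[i] <= v <= upper[i]]
```

With 100 points and a window of 25, the forward pass covers indices 25 to 99 and the reversed pass covers indices 0 to 74, so every point gets a band. Note that a large spike also inflates the standard deviation of any window that contains it, which widens the band for its neighbours; a more robust spread estimate may be preferable in practice.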