Search code examples
algorithmmachine-learningtime-seriesbaseline

Algorithms for establishing baselines from time series data


In my app I collect a lot of metrics: hardware/native system metrics (such as CPU load, available memory, swap memory, network IO in terms of packets and bytes sent/received, etc.) as well as JVM metrics (garbage collectins, heap size, thread utilization, etc.) as well as app-level metrics (instrumentations that only have meaning to my app, e.g. # orders per minute, etc.).

Throughout the week, month, year I see trends/patterns in these metrics. For instance when cron jobs all kick off at midnight I see CPU and disk thrashing as reports are being generated, etc.

I'm looking for a way to assess/evaluate metrics as healthy/normal vs unhealthy/abnormal but that takes these patterns into consideration. For instance, if CPU spikes around (+/- 5 minutes) midnight each night, that should be considered "normal" and not set off alerts. But if CPU pins during a "low tide" in the day, say between 11:00 AM and noon, that should definitely cause some red flags to trigger.

I have the ability to store my metrics in a time-series database, if that helps kickstart this analytical process at all, but I don't have the foggiest clue as to what algorithms, methods and strategies I could leverage to establish these cyclical "baselines" that act as a function of time. Obviously, such a system would need to be pre-seeded or even trained with historical data that was mapped to normal/abnormal values (which is why I'm learning towards a time-series DB as the underlying store) but this is new territory for me and I don't even know what to begin Googling so as to get back meaningful/relevant/educated solution candidates in the search results. Any ideas?


Solution

  • You could categorize each metric (CPU load, available memory, swap memory, network IO) with the day and time as bad or good for each metric. Come up with a set of data for a given time frame with metric values and whether they are good or bad. Train a model using 70% of the data with the good and bad answers in the data. Then test the trained model using the other 30% of data without the answers to see if you get the predicted results (good,bad) from the model. You could use a classification algorithm.