python, data-mining, data-science, unsupervised-learning

Unsupervised Learning with Time Series Data


I have a data set consisting of many time series. I want to build a model that can determine which series, or sets of series, are independent of the rest, and which are dependent, along with what they depend on.

In other words, imagine I have the series A, B, and C and I don't know anything about them. An output could be that A and B are independent, but knowing B at time t-1 helps us predict C at time t.

What is this kind of problem called? How can I solve it if I have N series instead of just 3? I am familiar with machine learning techniques in Python, but I'm not sure whether there are better methods out there, such as genetic algorithms, that could help me find the solution.

I was thinking of something along the lines of unsupervised learning or clustering, but I'm not sure how that can be done with time series. Any thoughts on this?

If you can provide any pointers, links, etc., I would be forever grateful!


Solution

  • If I understand your question correctly, you want to know whether your time series (i.e. vectors) are correlated. To determine that, I would encourage you to use scipy.stats or numpy.corrcoef.

    If you just want to determine whether they behave in the same way, you can calculate their percent changes over time (to normalise them) and compare those.
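
A rough sketch of the suggestion above, using only numpy.corrcoef on percent changes. The helper names pct_change and lagged_corr are made up for illustration, the data is synthetic, and the lag argument is my own addition to cover the "B at t-1 helps predict C at t" example from the question. Treat it as a starting point under those assumptions, not as a complete dependency test.

```python
import numpy as np


def pct_change(series):
    """Percent change between consecutive observations (result is one shorter)."""
    series = np.asarray(series, dtype=float)
    return np.diff(series) / series[:-1]


def lagged_corr(x, y, lag=0):
    """Pearson correlation between x[t - lag] and y[t], for lag >= 0.

    Hypothetical helper: it simply shifts the two arrays against each other
    and calls numpy.corrcoef.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return np.corrcoef(x, y)[0, 1]


# Synthetic example data (an assumption, not from the question):
# A and B are independent, and C follows B with a one-step delay.
rng = np.random.default_rng(0)
n = 500
r_a = rng.normal(scale=0.01, size=n)
r_b = rng.normal(scale=0.01, size=n)
r_c = rng.normal(scale=0.002, size=n)
r_c[1:] += 0.8 * r_b[:-1]  # C at time t depends on B at time t-1

# Turn the returns into strictly positive price-like series.
a = 100 * np.exp(np.cumsum(r_a))
b = 100 * np.exp(np.cumsum(r_b))
c = 100 * np.exp(np.cumsum(r_c))

# Normalise with percent changes, then compare.
pa, pb, pc = pct_change(a), pct_change(b), pct_change(c)

# Full same-time correlation matrix: each row is treated as one series.
corr_matrix = np.corrcoef(np.vstack([pa, pb, pc]))

corr_ab = lagged_corr(pa, pb)              # A vs B, same time step
corr_bc_lag0 = lagged_corr(pb, pc)         # B vs C, same time step
corr_bc_lag1 = lagged_corr(pb, pc, lag=1)  # B at t-1 vs C at t

print(np.round(corr_matrix, 2))
print("A~B:", round(corr_ab, 2))
print("B~C lag 0:", round(corr_bc_lag0, 2))
print("B~C lag 1:", round(corr_bc_lag1, 2))
```

With N series you can stack all of them into one array and read the pairwise values off the matrix, repeating the comparison for each lag you care about. If you also want a p-value for a pair, scipy.stats.pearsonr returns one alongside the coefficient. Keep in mind that correlation only picks up linear relationships.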