python  hadoop  mapreduce  time-series  holtwinters

Implementing ARIMA or Holt-Winters using Map-Reduce in Python


I am trying to deploy a time-series model using Map-Reduce in Python on a Hadoop infrastructure, without using the statsmodels package. Since I am new to Map-Reduce programming, I cannot figure out how to implement one. I did some reading on rolling-window approaches, but I still don't see how to apply them here.

My dataset looks something like this:

[image of the dataset not available]


Solution

  • The code is split into four parts: the mapper, the combiner, the reducer, and a file (slidingwindow.py) containing all the classes. We used a sliding-window approach to calculate the forecast. The combiner marks every entry it forecasts successfully with 'F', and marks the entries for which it could not fill the window with 'B' or 'E'. The reducer then forecasts the entries marked 'B' and 'E' and produces the output.

    The Python files can be found here:

    https://github.com/abhiray92/mapreduce_arima/tree/main/Linux_Server
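To make the 'F' / 'B' / 'E' idea concrete, here is a minimal in-process sketch. It is not the code from the repository above, and several things in it are assumptions: I read 'B' as "entry at the beginning of a split that lacks enough history" and 'E' as "entry at the end of a split whose raw value the next split will need"; the forecast is a plain moving average standing in for ARIMA or Holt-Winters; `WINDOW`, `combiner`, `reducer` and `run_job` are hypothetical names; and the series is assumed to have contiguous integer indices. It simulates the splits with plain lists instead of running on Hadoop.

```python
from statistics import mean

WINDOW = 3  # hypothetical window size (number of past points per forecast)


def combiner(chunk):
    """Process one input split: a list of (index, value) sorted by index.

    Emits (index, tag, payload):
      'F' -> payload is a forecast computed entirely inside this split
      'B' -> payload is the raw value; the window could not be filled here
      'E' -> payload is the raw value of one of the last WINDOW entries,
             which a later split may need as history
    """
    for pos, (idx, value) in enumerate(chunk):
        if pos >= WINDOW:
            history = [v for _, v in chunk[pos - WINDOW:pos]]
            yield idx, "F", mean(history)  # stand-in for the real model
        else:
            yield idx, "B", value
        if pos >= len(chunk) - WINDOW:
            yield idx, "E", value


def reducer(records):
    """Merge combiner output and forecast the entries tagged 'B'."""
    forecasts, raw, pending = {}, {}, []
    for idx, tag, payload in records:
        if tag == "F":
            forecasts[idx] = payload
        else:
            raw[idx] = payload  # 'B' and 'E' both carry raw observations
            if tag == "B":
                pending.append(idx)
    for idx in pending:
        history = [raw[i] for i in range(idx - WINDOW, idx) if i in raw]
        if len(history) == WINDOW:  # the very first entries have no history
            forecasts[idx] = mean(history)
    return dict(sorted(forecasts.items()))


def run_job(series, chunk_size):
    """Simulate the job: split the series, run combiners, then one reducer."""
    indexed = list(enumerate(series))
    records = []
    for start in range(0, len(indexed), chunk_size):
        records.extend(combiner(indexed[start:start + chunk_size]))
    return reducer(records)
```

Calling `run_job([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0], chunk_size=4)` should give the same forecasts as a single pass over the whole series, which is the point of carrying the boundary entries forward to the reducer. In a real Hadoop Streaming job the combiner and reducer would read and write tab-separated lines on stdin/stdout rather than Python tuples.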