Search code examples
rmachine-learningtime-seriesdata-scienceforecasting

Hierarchical Time Series in R - Creating nodes (HTS)


Currently my data is in the following Data Frame

I am trying to create a hierarchal forecast model and one of the first things I need to do is create nodes for it to roll up appropriately

I have converted it to a time series doing the following below

myts <- ts(df,start = c(2014,1),end = c(2022,5),frequency = 12).

I want my two columns "exi" and "new" to be on the same level (level 1) and roll up to total. Though I am having issues understanding the structure on how to do that when using characters.

y <- hts(myts, characters = c(3,0) )

In this case, how would I specify the level and the number of nodes on the level?


Solution

  • This will do it.

    library(hts)
    df <- cbind(new=rnorm(101), exi=rnorm(101))
    myts <- ts(df, frequency=12, start=2014)
    hts(myts, nodes=list(2))
    #> Since argument characters are not specified, the default labelling system is used.
    #> Hierarchical Time Series 
    #> 2 Levels 
    #> Number of nodes at each level: 1 2 
    #> Total number of series: 3 
    #> Number of observations per series: 101 
    #> Top level series: 
    #>               Jan          Feb          Mar          Apr          May
    #> 2014  1.059021025  0.070720551  0.943501576  1.281006621  0.077826858
    #> 2015 -2.571193569 -3.521491556  0.437805199  0.634686127  0.335985240
    #> 2016  3.275815702  3.718421177 -0.397562597 -1.594078028  0.313479815
    #> 2017 -0.105944528 -1.848429387 -2.068280839 -2.373310605 -2.533045431
    #> 2018  0.158463797  0.506874814 -0.676238180 -1.642189637  1.058472963
    #> 2019 -0.172692203  0.624001783 -0.964404368  2.280085589 -0.155924897
    #> 2020 -2.220163820  0.136283001  1.762032698  0.424643151 -1.599905992
    #> 2021  1.637001865 -2.631796024  1.123368365 -1.261582780 -0.292671546
    #> 2022 -0.001872089 -1.744886881 -0.482440605 -0.112014784 -1.682054917
    #>               Jun          Jul          Aug          Sep          Oct
    #> 2014  2.257358427  0.417126947 -0.513870167 -0.061706537  0.812069716
    #> 2015  0.083398529  2.061362529 -2.334635491 -1.832884832 -0.374096584
    #> 2016  1.399752967 -0.226514090 -0.175373983  0.032604708  1.666988822
    #> 2017 -0.005404140 -1.886962968  0.586309402 -0.346177278  2.484535524
    #> 2018 -0.463699811 -0.259065966 -1.644560411  0.744973289  1.414781985
    #> 2019 -1.770046964 -0.198155900  0.624782718 -3.146784932  0.706044061
    #> 2020 -0.308583086  0.709494594 -1.080280409  0.469290221  0.012920335
    #> 2021 -0.656702788 -1.597220350 -0.581150271  0.423158948  0.303229563
    #> 2022                                                                 
    #>               Nov          Dec
    #> 2014 -0.663937133 -1.505427051
    #> 2015  0.965955466  0.584356013
    #> 2016 -2.183735185 -0.158921087
    #> 2017  1.150508529  0.355272166
    #> 2018 -0.103622748 -2.213258534
    #> 2019 -2.749029384  1.587499512
    #> 2020 -0.946231453  1.333822797
    #> 2021 -0.294289414  1.999698540
    #> 2022
    

    Created on 2022-06-10 by the reprex package (v2.0.1)

    You don't actually need the nodes argument here, as it will assume there are only 2 levels by default.

    The characters argument is designed to be used when there are more than 2 levels, and the column names contain the structural information about each series.