Tags: r, time-series, forecast, tsibble, fable

Mutate a column in a tsibble dataframe, applying a Box-Cox transformation


I am a big fan of Hyndman's packages, but I have stumbled over the Box-Cox transformation.

I have a dataframe

class(chicago_sales)
[1] "tbl_ts"     "tbl_df"     "tbl"        "data.frame"

I am trying to add an extra column with mutate, holding the Box-Cox transformed Median_price variable.

foo <- chicago_sales %>% 
mutate(bc = BoxCox(x = chicago_sales$Median_price, lambda = 
BoxCox.lambda(chicago_sales$Median_price)))

This gives me some result (probably wrong), and I cannot apply autoplot to it.

I also tried to apply the code from Hyndman's book, but failed.

What am I doing wrong? Thanks!

UPDATED:


Solution

  • The issue: inside dplyr verbs on a tsibble, you refer to the column by its bare name, Median_price, not chicago_sales$Median_price. With tsibbles I would advise using fable and fabletools, but if you are using forecast, it should work like this:

    library(tsibble)
    library(dplyr)
    library(forecast)
    
    pedestrian %>% 
      mutate(bc = BoxCox(Count, BoxCox.lambda(Count)))
    # A tsibble: 66,037 x 6 [1h] <Australia/Melbourne>
    # Key:       Sensor [4]
       Sensor         Date_Time           Date        Time Count    bc
       <chr>          <dttm>              <date>     <int> <int> <dbl>
     1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01     0  1630 11.3 
     2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01     1   826  9.87
     3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01     2   567  9.10
     4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01     3   264  7.65
     5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01     4   139  6.52
     6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01     5    77  5.54
     7 Birrarung Marr 2015-01-01 06:00:00 2015-01-01     6    44  4.67
     8 Birrarung Marr 2015-01-01 07:00:00 2015-01-01     7    56  5.04
     9 Birrarung Marr 2015-01-01 08:00:00 2015-01-01     8   113  6.17
    10 Birrarung Marr 2015-01-01 09:00:00 2015-01-01     9   166  6.82
    # ... with 66,027 more rows
    

    I used the built-in pedestrian dataset from the tsibble package, as you did not provide a dput of chicago_sales.
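
    For the fable/fabletools route mentioned above, a minimal sketch could look like the following. It is not run against chicago_sales; it assumes the tsibbledata package is installed and uses its aus_production data (the Gas column) as a stand-in, with lambda estimated by the guerrero feature from feasts and the transformation applied with box_cox() from fabletools.

```r
library(dplyr)
library(tsibble)
library(tsibbledata)  # assumption: supplies the aus_production stand-in data
library(feasts)       # guerrero feature for estimating lambda
library(fabletools)   # features() and box_cox()
library(ggplot2)      # autoplot() generic

# Estimate the Box-Cox lambda with the Guerrero method
lambda <- aus_production %>%
  features(Gas, features = guerrero) %>%
  pull(lambda_guerrero)

# Add the transformed series as a new column, using the bare column name
gas_bc <- aus_production %>%
  mutate(bc = box_cox(Gas, lambda))

# The result is still a tsibble, so autoplot() can be applied to the new column
autoplot(gas_bc, bc)
```

    With your own data, the same pattern would use chicago_sales in place of aus_production and Median_price in place of Gas. If the tsibble has several keys, features() returns one lambda per key, so you would need to pick or join the value that matches each series.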