Search code examples
rdatabasedata.tabletime-seriesanalytics

R data.table: Rebase each group within the panel by a value found in another column


Let's say my dataset looks like this:

data.table(groups = rep(c("A","B"), each=5), time = rep(2011:2015, 2), value = runif(10,95,105))

    groups time     value
 1:      A 2011  97.66647
 2:      A 2012 103.00917
 3:      A 2013 103.02462
 4:      A 2014 103.25105
 5:      A 2015 101.94881
 6:      B 2011  97.69248
 7:      B 2012  95.63296
 8:      B 2013  98.40150
 9:      B 2014 104.14164
10:      B 2015 103.61722

I would like to rebase the value, within each group, by a value corresponding to chosen year (say, 2013). So, for group A I would like to divide each value by 103.02462 (value for 2013), for group B by 98.40150, etc....

All the solutions that I can think of are incredibly complex, it would be great if someboudy would share their ideas


Solution

  • After grouping by 'groups', get the 'value' that corresponds to 'time' value of 2013 and use that to divide the column 'value'

    library(data.table)
    dt1[, value := value/value[time == 2013], by = groups]
    dt1
    #    groups time     value
    # 1:      A 2011 0.9479916
    # 2:      A 2012 0.9998500
    # 3:      A 2013 1.0000000
    # 4:      A 2014 1.0021978
    # 5:      A 2015 0.9895577
    # 6:      B 2011 0.9927946
    # 7:      B 2012 0.9718649
    # 8:      B 2013 1.0000000
    # 9:      B 2014 1.0583339
    #10:      B 2015 1.0530045
    

    Or with match

     dt1[, value := value/value[match(2013, time)], by = groups]
    

    data

    dt1 <- structure(list(groups = c("A", "A", "A", "A", "A", "B", "B", 
    "B", "B", "B"), time = c(2011L, 2012L, 2013L, 2014L, 2015L, 2011L, 
    2012L, 2013L, 2014L, 2015L), value = c(97.66647, 103.00917, 103.02462, 
    103.25105, 101.94881, 97.69248, 95.63296, 98.4015, 104.14164, 
    103.61722)), class = c("data.table", "data.frame"), row.names = c(NA, 
    -10L))