Creating an n+1 counter variable for every observation using data.table in R?


I want to add a variable to the data table below that counts the nth time an ID was observed, in chronological order by year (i.e., within each ID, the new variable increases by 1 with every row). Here is a sample of the panel-data table I'm working on:

library(data.table)
DT <- data.table("ID"=c(1,1,1,1,2,2,3,3,3),
  "year"=c(2005,2006,2007,2008,2014,2015,2008,2009,2010))

ID, year
1, 2005
1, 2006
1, 2007
1, 2008
2, 2014
2, 2015
3, 2008
3, 2009
3, 2010

And here is the desired output with the new variable crop:

ID, year, crop
1, 2005, 1
1, 2006, 2
1, 2007, 3
1, 2008, 4
2, 2014, 1
2, 2015, 2
3, 2008, 1
3, 2009, 2
3, 2010, 3

Is this possible to do using data.table?


Solution

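  • You could use rleid with by. rleid starts a new id whenever the value changes, so this gives the desired counter as long as the years within each ID are already sorted and distinct, as they are in the sample: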

    DT[,crop:=rleid(year),by=ID][]
    
       ID year crop
    1:  1 2005    1
    2:  1 2006    2
    3:  1 2007    3
    4:  1 2008    4
    5:  2 2014    1
    6:  2 2015    2
    7:  3 2008    1
    8:  3 2009    2
    9:  3 2010    3
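
  • If the rows are not guaranteed to be sorted, or a year could repeat within an ID, a plain row counter may be safer. The following is only a sketch under those assumptions, not part of the original answer: it sorts with setorder first, then numbers the rows within each ID using seq_len(.N). The column name crop2 is hypothetical and is only there to show that rowid(ID) gives the same result without by.

```r
library(data.table)

DT <- data.table(ID   = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
                 year = c(2005, 2006, 2007, 2008, 2014, 2015, 2008, 2009, 2010))

# Sort by ID, then year, so the counter follows chronological order
setorder(DT, ID, year)

# seq_len(.N) numbers the rows 1..N within each ID group
DT[, crop := seq_len(.N), by = ID]

# Alternative without by: rowid() numbers repeated values of ID
# in order of appearance (crop2 is a hypothetical name for comparison)
DT[, crop2 := rowid(ID)]

print(DT)
```

Unlike rleid(year), both variants count rows rather than changes in year, so two consecutive rows with the same year would still receive different counter values.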