Reshape data table

I have a data table like (data is not necessarily ordered by 'col1')

    col0    col1      col2
1:  abc       1         a
2:  abc       2         b 
3:  abc       3         c 
4:  abc       4         d 
5:  abc       5         e
6:  def       1         a
7:  def       2         b 
8:  def       3         c 
9:  def       4         d 
10: def       5         e

I want to reshape it the following way

    col0      col1      col2      new_1   new_2   new_3   new_4
1:  abc         1         a         NA      NA       NA      NA
2:  abc         2         b         a       NA       NA      NA
3:  abc         3         c         b       a        NA      NA
4:  abc         4         d         c       b        a       NA 
5:  abc         5         e         d       c        b       a
6:  def         1         a         NA      NA       NA      NA
7:  def         2         b         a       NA       NA      NA
8:  def         3         c         b       a        NA      NA
9:  def         4         d         c       b        a       NA 
10: def         5         e         d       c        b       a

Basically I want to get previously occurred values of col2 for each row in the same row as above and if there is none the corresponding new column should say NA.

I can of course do it by merge on col2 5 times but I need to do this on a large table (in that case I will have to merge 20-30 times).

What is the best way to achieve it in R in 1 or 2 lines?

Solution

We can use shift from the devel version of data.table i.e. v1.9.5 (Instructions to install the devel version are here. By default, the type in shift is lag. We can specify n as a vector, in this case 1:4. We assign (:=) the output to new columns.

library(data.table)#v1.9.5+
DT[, paste('new', 1:4, sep="_") := shift(col2, 1:4)]
DT
#   col1 col2 new_1 new_2 new_3 new_4
#1:    1    a    NA    NA    NA    NA
#2:    2    b     a    NA    NA    NA
#3:    3    c     b     a    NA    NA
#4:    4    d     c     b     a    NA
#5:    5    e     d     c     b     a

For the new dataset 'DT2', we need to group by 'col0' and then do the shift on 'col2'

DT2[, paste('new', 1:4, sep="_") := shift(col2, 1:4), by = col0]
DT2
#   col0 col1 col2 new_1 new_2 new_3 new_4
# 1:  abc    1    a    NA    NA    NA    NA
# 2:  abc    2    b     a    NA    NA    NA
# 3:  abc    3    c     b     a    NA    NA
# 4:  abc    4    d     c     b     a    NA
# 5:  abc    5    e     d     c     b     a
# 6:  def    1    a    NA    NA    NA    NA
# 7:  def    2    b     a    NA    NA    NA
# 8:  def    3    c     b     a    NA    NA
# 9:  def    4    d     c     b     a    NA
#10:  def    5    e     d     c     b     a

data

df1 <- structure(list(col1 = 1:5, col2 = c("a", "b", "c", "d", "e"), 
new_1 = c(NA, "a", "b", "c", "d"), new_2 = c(NA, NA, "a", 
"b", "c"), new_3 = c(NA, NA, NA, "a", "b"), new_4 = c(NA, 
NA, NA, NA, "a")), .Names = c("col1", "col2", "new_1", "new_2", 
"new_3", "new_4"), class = "data.frame", row.names = c(NA, -5L
))

DT <- as.data.table(df1)

df2 <- structure(list(col0 = c("abc", "abc", "abc", "abc", "abc", 
"def", 
"def", "def", "def", "def"), col1 = c(1L, 2L, 3L, 4L, 5L, 1L, 
2L, 3L, 4L, 5L), col2 = c("a", "b", "c", "d", "e", "a", "b", 
 "c", "d", "e")), .Names = c("col0", "col1", "col2"), 
class = "data.frame", row.names = c(NA, -10L))
DT2 <- as.data.table(df2)