
Add multiple columns based on values in other columns in R


I have a data table that contains three variables:

  1. hours: integer, in the range [0, 23], increasing
  2. mins: integer, one of (10, 20, 30, 40, 50, 60), also increasing
  3. x: integer

Below is a simple example:

stocks <- data.frame(
hours = c(0,0,0,0,0,0),
mins = c(10,10,10,20,20,30),
x = c(2,4,4,5,3,4)
)

Based on this table, I want to spread x into multiple columns, one per combination of hours and mins. The desired output looks like this:

    0_10 0_20 0_30
     2    5    4
     4    3    
     4        

I tried the dcast function, but the resulting table only counts how often each value of x occurs :(

library(data.table)
dcast(setDT(stocks), x ~ hours+mins, value.var = c("x")) 
#Aggregate function missing, defaulting to 'length'
   x 0_10 0_20 0_30
1: 2    1    0    0
2: 3    0    1    0
3: 4    2    0    1
4: 5    0    1    0

Any suggestions?

Thanks!


Solution

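  • We need to change the formula in dcast: put a within-group row index on the left-hand side instead of x, so that each value lands in its own cell and nothing has to be aggregated.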

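    library(data.table)  # requires 1.9.7+ for rowid()
    # rowid(hours, mins) numbers the rows within each (hours, mins) group;
    # the index column shows up as 'hours' in the result, so it is dropped afterwards
    dcast(setDT(stocks), rowid(hours, mins) ~ hours + mins, value.var = "x")[, hours := NULL][]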
    #   0_10 0_20 0_30
    #1:    2    5    4
    #2:    4    3   NA
    #3:    4   NA   NA
    

    With versions < 1.9.7, rowid is not available, so we first create the sequence variable grouped by 'hours' and 'mins', and then run dcast on it:

    setDT(stocks)[, Seq := 1:.N, by = .(hours, mins)]
    dcast(stocks, Seq~hours + mins, value.var = "x")[, Seq := NULL][]
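
To see why the index matters: with x on the left-hand side, several rows can fall into the same cell, so dcast falls back to counting them with 'length'. The sketch below inspects what rowid returns for the sample data and then reshapes with the index stored as an explicit column. The column name seq_id is only an illustrative choice, and the sketch assumes a data.table version that provides rowid.

```r
library(data.table)

stocks <- data.table(
  hours = c(0, 0, 0, 0, 0, 0),
  mins  = c(10, 10, 10, 20, 20, 30),
  x     = c(2, 4, 4, 5, 3, 4)
)

# Position of each row within its (hours, mins) group
idx <- rowid(stocks$hours, stocks$mins)

# Store the index under a hypothetical name (seq_id), cast on it,
# then drop it so only the hours_mins columns remain
stocks[, seq_id := rowid(hours, mins)]
wide <- dcast(stocks, seq_id ~ hours + mins, value.var = "x")[, seq_id := NULL]
print(wide)
```

Because every (seq_id, hours, mins) combination occurs at most once, no aggregate function is needed and the missing combinations are filled with NA.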