
dcast function taking arguments from two value variables


Say I have an example data frame with the following structure:

cars=c("A","A","A","A", "B","B","B","B", "C","C","C","C","A","A","A","A", "B","B","B","B", "C","C","C","C")
vendor=c("d","e","f","g", "d","e","f","g", "d","e","f","g", "d","e","f","g", "d","e","f","g", "d","e","f","g")
state=c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2)
PS_mean=c(100, 110, 120, 130, 90, 95, 140, 180, 70, 80, 120, 150, 100, 110, 120, 130, 90, 95, 140, 180, 70, 80, 120, 150)
PS_stdv=c(10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40)
mycars=data.frame(cars, vendor, state, PS_mean, PS_stdv)

I now want to reshape it with dcast, like this:

mycars_cov<-dcast(setDT(mycars[c('cars','state','PS_mean','PS_stdv')]), cars~state, value.var=c("PS_mean", "PS_stdv"), car_PS_var("PS_mean", "PS_stdv"))

As you can see, the function car_PS_var is user-defined and takes two inputs:

car_PS_var<- function(x,y){
   x<-as.numeric(x)
   y<-as.numeric(y)
   z=sd(x)*sd(y)/mean(x)
   return(z)
}

I do not know how to apply a function that takes both value.var columns as arguments and returns a single value. Normally, dcast applies the aggregation function to one variable at a time, which is why car_PS_var("PS_mean", "PS_stdv") does not work.

In this form, R throws an error because the aggregation function in dcast cannot take two inputs.
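
To illustrate the limitation, here is a minimal sketch, assuming the data.table package is installed. The toy table `dt` is hypothetical (a cut-down version of the data above). With several value.var columns, the aggregation function is called on each column separately, so it never sees both columns at once:

```r
library(data.table)

# Hypothetical toy data: two cars, two states, two rows per combination
dt <- data.table(
  cars    = rep(c("A", "B"), each = 4),
  state   = rep(c(1, 2), times = 4),
  PS_mean = c(100, 110, 120, 130, 90, 95, 140, 180),
  PS_stdv = c(10, 20, 30, 40, 10, 20, 30, 40)
)

# sd() is applied to PS_mean and to PS_stdv independently,
# giving one set of state columns per value variable
wide <- dcast(dt, cars ~ state,
              value.var = c("PS_mean", "PS_stdv"),
              fun.aggregate = sd)
print(names(wide))
```

The result has one block of state columns for PS_mean and another for PS_stdv, rather than a single combined value per cell.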

So how can I do this correctly? Suggestions using any other R method that does the job are also welcome.


Solution

  • I'm not sure I understood your goal, but as I interpret it, a quick and dirty way is to group by cars and state first, create the new column, and then dcast the resulting data table:

    library(data.table)
    mycars <- as.data.table(mycars)
    
    temp <- mycars[, .(z = car_PS_var(PS_mean, PS_stdv)),
                  by = c("cars", "state")]
    
    dcast(temp, cars ~ state, value.var = "z")
    
       cars        1        2
    1:    A 1.449275 1.449275
    2:    B 4.325825 4.325825
    3:    C 4.545340 4.545340
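
Since the question is open to other R methods, here is a base R sketch of a different technique (not the approach above): pass row indices to tapply, so the function can look up both columns for each cars/state group. It rebuilds the question's data with rep() so that it runs on its own. The name `res` is only illustrative.

```r
# Same example data as in the question, built compactly with rep()
mycars <- data.frame(
  cars    = rep(rep(c("A", "B", "C"), each = 4), times = 2),
  vendor  = rep(c("d", "e", "f", "g"), times = 6),
  state   = rep(c(1, 2), each = 12),
  PS_mean = rep(c(100, 110, 120, 130, 90, 95, 140, 180,
                  70, 80, 120, 150), times = 2),
  PS_stdv = rep(c(10, 20, 30, 40), times = 6)
)

# The two-argument function from the question
car_PS_var <- function(x, y) {
  x <- as.numeric(x)
  y <- as.numeric(y)
  sd(x) * sd(y) / mean(x)
}

# Aggregate over row indices: each group receives its row numbers,
# which are then used to subset both value columns
res <- tapply(
  seq_len(nrow(mycars)),
  list(cars = mycars$cars, state = mycars$state),
  function(i) car_PS_var(mycars$PS_mean[i], mycars$PS_stdv[i])
)
print(res)
```

This returns a matrix with cars as rows and state as columns, and it should match the dcast output shown above. It needs no extra packages, at the cost of returning a matrix rather than a data.table.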