I have the following data frame:
(dput() for testing bellow)
structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "797 Fleet", class = "factor"), V2 = structure(c(5L, 1L, 4L, 3L, 2L, 5L, 1L, 4L, 3L, 2L, 5L, 1L, 4L, 3L, 2L, 5L), .Label = c("Available Hours", "Cycle Time", "Performance", "Production time", "Units"), class = "factor"), V3 = c(51, 2989.601111, 2498.85, 540.8754973, 39.93337086, 52, 30010.73389, 24946.62833, 529.4659407, 40.81742793, 36, 20778.5525, 17174.18722, 535.7960907, 40.36234152, 19)), .Names = c("V1", "V2", "V3"), class = "data.frame", row.names = c(NA, -16L))
i need to aggregate the data but with different functions for different features; for the 797 fleet, the units and production time should be added, but the performance an cycle time should be averaged.
I have just tried aggregate with two functions, but i get two columns one with all added and other with all averaged and i need only one column.
How can i accomplish that?
Here's an idea using data.table
:
library(data.table)
fun_list <- list("Units" = sum, "Production time" = sum, "Performance" = mean, "Cycle Time" = mean)
setDT(df)[V2 %in% names(fun_list), .(res = fun_list[[as.character(.BY[[2]])]](V3)),by = .(V1, V2)]
# V1 V2 res
#1: 797 Fleet Units 158.00000
#2: 797 Fleet Production time 44619.66555
#3: 797 Fleet Performance 535.37918
#4: 797 Fleet Cycle Time 40.37105
Let's unpack this solution a little. First, we store a map of the functions we want to apply to each of the values in V2
. This list is simply a list of functions. E.g. "Units" = sum
means we want to apply sum
to the "Units"
group. To see how this works try: fun_list[["Units"]](c(1,2,3))
.
We then use this in our group by operation in data.table
. We use the V2
value stored in .BY
to index our function list. That is for each V2
value we choose a function from our list to apply. This is accomplished by fun_list[[as.character(.BY[[2]])]]
(Note we need to as.character
since .BY
is a factor). Lastly we apply that function to V3
which is what (V3)
does in the last part of the code fun_list[[as.character(.BY[[2]])]](V3))
!