
Create combinations within a split group in R


With the data frame below of Locations, Days, and Quantities, I'm looking for a way to build, within each Location, every combination that takes one quantity from each Day. In production these combinations may grow quite large, so a data.table or plyr approach would be appreciated.

library(gtools)    
dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
Qty = c(1,2,3,4,5))

The output for this example should be:

  Loc Day  Qty
1  51 Mon   1
2  51 Tue   3
3  51 Wed   5

4  51 Mon   1
5  51 Tue   4
6  51 Wed   5

7  51 Mon   2
8  51 Tue   3
9  51 Wed   5

10 51 Mon   2
11 51 Tue   4
12 51 Wed   5

I've tried a few nested lapply calls, which get me close, but I'm not sure how to take the next step and use the combn() function within each location.

lapply(split(dat, dat$Loc), function(x) {
      lapply(split(x, x$Day), function(y) {
          y$Qty
    })                
})

I'm able to get the correct combinations if each Location > Day group is in its own vector, but I'm struggling to get there from a data frame using a split-apply-combine method.

loc51_mon <- c(1,2)
loc51_tue <- c(3,4)
loc51_wed <- c(5)

unlist(lapply(loc51_mon, function(x) {
    lapply(loc51_tue, function(y) {
         lapply(loc51_wed, function(z) {
              combn(c(x,y,z), 3)
         })
    })
}), recursive = FALSE)

[[1]]
[[1]][[1]]
     [,1]
[1,]    1
[2,]    3
[3,]    5

[[2]]
[[2]][[1]]
     [,1]
[1,]    1
[2,]    4
[3,]    5

[[3]]
[[3]][[1]]
     [,1]
[1,]    2
[2,]    3
[3,]    5

[[4]]
[[4]][[1]]
     [,1]
[1,]    2
[2,]    4
[3,]    5

Solution

  • This should work; however, further complexity would require changes to the function:

    library(data.table)
    dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
                      Qty = c(1,2,3,4,5), stringsAsFactors = FALSE)
    setDT(dat)
    
    comb_in <- function(Qty_In, Day_In){
        # Collapse the quantities of each day into one "|"-separated string
        temp_df <- aggregate(Qty_In ~ Day_In, cbind(Qty_In, as.character(Day_In)), paste, collapse = "|")
        # Split back into one vector of quantities per day, named by day
        temp_list <- strsplit(temp_df$Qty_In, split = "|", fixed = TRUE)
        names(temp_list) <- as.character(temp_df$Day_In)
        # Build every one-per-day combination, number it, and reshape to long form
        melt(as.data.table(expand.grid(temp_list))[, case_group := .I], id.vars = "case_group", variable.name = "Day", value.name = "Qty")
    }
    
    dat[, comb_in(Qty_In = Qty, Day_In = Day), by = Loc][order(Loc,case_group,Day)]
        Loc case_group Day Qty
     1:  51          1 Mon   1
     2:  51          1 Tue   3
     3:  51          1 Wed   5
     4:  51          2 Mon   2
     5:  51          2 Tue   3
     6:  51          2 Wed   5
     7:  51          3 Mon   1
     8:  51          3 Tue   4
     9:  51          3 Wed   5
    10:  51          4 Mon   2
    11:  51          4 Tue   4
    12:  51          4 Wed   5
    

    You can now filter by case_group to get each combination.
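
    As a possible variation on the same idea, the sketch below replaces the paste/strsplit round trip with split(), which should keep Qty numeric, and then shows the case_group filter mentioned above. The helper name combos_by_day and the variable res are hypothetical, introduced only for illustration. Note that split() orders the days alphabetically, which happens to match Mon, Tue, Wed here.

```r
library(data.table)

dat <- data.table(Loc = c(51, 51, 51, 51, 51),
                  Day = c("Mon", "Mon", "Tue", "Tue", "Wed"),
                  Qty = c(1, 2, 3, 4, 5))

# Hypothetical helper (not part of the answer above)
combos_by_day <- function(qty, day) {
  # One vector of quantities per day, e.g. list(Mon = c(1, 2), Tue = c(3, 4), Wed = 5)
  per_day <- split(qty, day)
  # Every combination that takes exactly one value from each day
  grid <- as.data.table(expand.grid(per_day))
  # Number the combinations so they can be told apart after reshaping
  grid[, case_group := .I]
  # Long form: one row per (combination, day)
  melt(grid, id.vars = "case_group", variable.name = "Day", value.name = "Qty")
}

res <- dat[, combos_by_day(Qty, Day), by = Loc][order(Loc, case_group, Day)]

# Pull out a single combination by its case_group id
res[case_group == 1]
```

    Because the work happens inside by = Loc, each location gets its own set of case_group ids starting at 1, so filtering on both Loc and case_group identifies one combination.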