Search code examples
rperformancedplyrpurrr

Increase speed in finding "Ιncreasing dice roll sequences"


The problem was How many sequences of 9 dice rolls are increasing (e.g. 223444556). Ok, I know the answer is given by choose(14,9) but i just wanted to play around with dplyr.

A fast but not elegant way:

library(tidyverse)

expand.grid(data.frame(matrix(rep(1:6,9),ncol=9))) %>% 
 filter(X1<=X2 & X2<=X3 &X3<=X4 &X4<=X5 &X5<=X6 &X6<=X7 &X7<=X8 &X8<=X9) %>% tally

I tried the following two alternatives (without explicit reference to variable names), but they're both very slow (and memory consuming). Can you help me optimize my code using tidyverse?

 expand_grid(!!!data.frame(matrix(rep(1:6,9),ncol=9))) %>% 
  rownames_to_column(var = "grp") %>% 
  mutate(grp = as.numeric(grp)) %>% 
  pivot_longer (cols=!grp) %>% 
  group_by(grp) %>% 
  mutate(prev = lag(value)) %>% 
  filter(!is.na(prev)) %>% 
  transmute(dif=value-prev) %>% 
  summarize(res = all(dif >=0)) %>% 
  group_by(res) %>% summarize(n=n())


 9 %>% 
    rerun(1:6) %>% crossing(!!!.,.name_repair = "minimal") %>%
    set_names(glue::glue('c{1:ncol(.)}')) %>% 
    rowwise() %>%
    mutate(asc = all(diff(c_across(cols = everything())>=0))) %>% 
    filter(asc==TRUE) %>% tally

This is also slow, but not memory consuming.

 9 %>% 
  rerun(1:6) %>% crossing(!!!.,.name_repair = "minimal")  %>% 
  set_names(glue::glue('c{1:ncol(.)}')) %>% 
  filter(pmap_lgl(.,~{
    if(all(list(...) %>% flatten_dbl() %>% diff() >=0)) return(TRUE) else return(FALSE)
  })) %>% tally

Solution

  • Here is a tidyverse approach that relies on purrr:

    expand.grid(replicate(9, 1:6, FALSE)) %>%
      filter(reduce(map2(.[, -length(.)], .[, -1], ~ .x <= .y), `&`)) %>%
      tally()
    

    This is somewhat difficult to do in the contexts of pipes. We both need to compare columns n and n + 1 while reducing done to a logical vector. Then we need to filter the original dataset.

    And if you were only interested in the tally, we could sum the logical vector.

    expand.grid(replicate(9, 1:6, FALSE)) %>%
      {sum(reduce(map2(.[, -length(.)], .[, -1], ~ .x <= .y), `&`))}
    

    Finally, if you don't mind one more dependency, can parallel what you were doing with one of your approaches:

    library(matrixStats)
    
    expand.grid(replicate(9, 1:6, FALSE)) %>%
      {sum(rowAlls(rowDiffs(as.matrix(.)) >= 0L))}