How to use a variable as a parameter of the dplyr::slice_max() function in R


Given a data.frame named re:

library(dplyr)

re <- tibble(group = c(rep("A", 4), rep("B", 4), rep("C", 4)),
             value = runif(12),
             n_slice = c(rep(2, 4), rep(1, 4), rep(3, 4)))

# A tibble: 12 x 3
   group  value n_slice
   <chr>  <dbl>   <dbl>
 1 A     0.853        2
 2 A     0.726        2
 3 A     0.783        2
 4 A     0.0426       2
 5 B     0.320        1
 6 B     0.683        1
 7 B     0.732        1
 8 B     0.319        1
 9 C     0.118        3
10 C     0.0259       3
11 C     0.818        3
12 C     0.635        3

I'd like to slice by group, keeping a different number of rows in each group (the number given in n_slice).

I tried the code below, but I get an error saying that "n" must be a constant:

re %>% 
   group_by(group) %>% 
   slice_max(value, n = n_slice)

Error: `n` must be a constant in `slice_max()`.

Expected output:

  group value n_slice
  <chr> <dbl>   <dbl>
1 A     0.853       2
2 A     0.783       2
3 B     0.732       1
4 C     0.818       3
5 C     0.635       3
6 C     0.118       3

Solution

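  • In this case, one option is group_modify(), which applies a function to each group's rows, so n can be read from that group's n_slice:

    library(dplyr)
    re %>%
       group_by(group) %>%
       # .x holds the rows of the current group
       group_modify(~ .x %>%
              slice_max(value, n = first(.x$n_slice))) %>%
       ungroup()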
    

    -output (the values differ from the question because runif() draws new random numbers on each run)

    # A tibble: 6 × 3
      group value n_slice
      <chr> <dbl>   <dbl>
    1 A     0.931       2
    2 A     0.931       2
    3 B     0.722       1
    4 C     0.591       3
    5 C     0.519       3
    6 C     0.494       3
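
    The same per-group selection can also be sketched without slice_max() at all, by ranking values within each group and comparing the rank with n_slice. This is not part of the original answer. It reuses the values printed in the question (instead of fresh runif() draws) so the result can be checked against the expected output, and out_rank is just an illustrative name.

```r
library(dplyr)

# Values copied from the tibble printed in the question
re <- tibble(group = rep(c("A", "B", "C"), each = 4),
             value = c(0.853, 0.726, 0.783, 0.0426,
                       0.320, 0.683, 0.732, 0.319,
                       0.118, 0.0259, 0.818, 0.635),
             n_slice = rep(c(2, 1, 3), each = 4))

out_rank <- re %>%
  group_by(group) %>%
  # rank 1 = largest value in the group; keep ranks up to that group's n_slice
  filter(min_rank(desc(value)) <= n_slice) %>%
  # sort within each group so the largest value comes first
  arrange(desc(value), .by_group = TRUE) %>%
  ungroup()
```

    Because filter() is evaluated per row inside each group, n_slice does not have to be reduced to a single constant here. Like slice_max() with its default with_ties = TRUE, this keeps tied values.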
    

    Another option is to summarise each group into a list column using cur_data() and then unnest it (cur_data() is deprecated since dplyr 1.1.0 in favour of pick())

    library(tidyr)
    re %>%
        group_by(group) %>%
        summarise(out = list(cur_data() %>% 
            slice_max(value, n = first(n_slice)))) %>% 
        unnest(out)
    

    -output

    # A tibble: 6 × 3
      group value n_slice
      <chr> <dbl>   <dbl>
    1 A     0.931       2
    2 A     0.931       2
    3 B     0.722       1
    4 C     0.591       3
    5 C     0.519       3
    6 C     0.494       3
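
    On dplyr 1.1.0 or later, the summarise-and-unnest idea might be written with pick() and reframe() instead. The sketch below is not from the original answer. It assumes dplyr >= 1.1.0, reuses the values printed in the question so the result can be compared with the expected output, and out_pick is just an illustrative name.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides pick() and reframe()

# Values copied from the tibble printed in the question
re <- tibble(group = rep(c("A", "B", "C"), each = 4),
             value = c(0.853, 0.726, 0.783, 0.0426,
                       0.320, 0.683, 0.732, 0.319,
                       0.118, 0.0259, 0.818, 0.635),
             n_slice = rep(c(2, 1, 3), each = 4))

out_pick <- re %>%
  group_by(group) %>%
  # pick(everything()) is the current group's rows; first(n_slice) is that group's n
  reframe(slice_max(pick(everything()), value, n = first(n_slice)))
```

    reframe() returns an ungrouped tibble and unpacks the unnamed data frame into columns, so no separate unnest() or ungroup() step is needed.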