Tags: r, dplyr, data.table, tidyverse, tidyr

Count how many different groups an element appears in with R dplyr


My data looks like this:

library(tidyverse)

df3 <- tibble(fruits=c("apple","banana","ananas","apple","ananas","apple","ananas"),
              position=c("135","135","135","136","137","138","138"), 
              counts = c(100,200,100,30,40,50,100))

df3
#> # A tibble: 7 × 3
#>   fruits position counts
#>   <chr>  <chr>     <dbl>
#> 1 apple  135         100
#> 2 banana 135         200
#> 3 ananas 135         100
#> 4 apple  136          30
#> 5 ananas 137          40
#> 6 apple  138          50
#> 7 ananas 138         100

Created on 2022-02-21 by the reprex package (v2.0.1)

I want to group by fruits and find, for each fruit, which positions it occurs in, how many different positions that is, and the sum of its counts. I want my result to look like this:

fruits   groups        n_groups   sum_count
apple    135,136,138   3          180
banana   135           1          200
ananas   135,137,138   3          240

The groups column could be a list of characters; I do not care much about its structure.

Thank you for your time. Any guidance is appreciated.


Solution

  • The description is not entirely clear to me, but you can get the desired data frame by grouping by fruits and summarising:

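    df3 %>% 
      group_by(fruits) %>% 
      summarise(groups = list(unique(position)),    # positions each fruit occurs in
                n_groups = n_distinct(position),    # number of different positions
                counts = sum(counts))               # total counts per fruit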
    
      fruits groups    n_groups counts
      <chr>  <list>       <int>  <dbl>
    1 ananas <chr [3]>        3    240
    2 apple  <chr [3]>        3    180
    3 banana <chr [1]>        1    200
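
If a plain comma-separated string is preferred over a list column, as in the desired output in the question, the following is a minimal sketch of one way to do it. It assumes the same `df3` as in the question; the result name `res` is only illustrative, and the `sum_count` column name is taken from the desired table. `n_distinct(position)` is used so that a fruit appearing more than once at the same position would still count that position only once.

```r
library(dplyr)

# Same data as in the question
df3 <- tibble(
  fruits   = c("apple", "banana", "ananas", "apple", "ananas", "apple", "ananas"),
  position = c("135", "135", "135", "136", "137", "138", "138"),
  counts   = c(100, 200, 100, 30, 40, 50, 100)
)

res <- df3 %>%
  group_by(fruits) %>%
  summarise(
    groups    = paste(unique(position), collapse = ","),  # e.g. "135,136,138"
    n_groups  = n_distinct(position),                     # number of different positions
    sum_count = sum(counts)                               # total counts per fruit
  )

res
```

For apple this gives groups "135,136,138", n_groups 3 and sum_count 180, which matches the table in the question. The string form is easier to print or export, while a list column is easier to work with further, for example with `tidyr::unnest()`.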