Tags: r, group-by, summarize

Group a data frame by one value for different (unknown) values of another


This is the example data frame:

Cases <- 1:7  # assumed placeholder case IDs; the original definition of Cases is not shown

Codes <- c("70", "70", "60", "60", "60", "60", "50")

Locations <- c("a", "a", "a", "b", "b", "b", "b")

df <- data.frame(Cases, Codes, Locations) 

I want to group and summarize the codes for each location. It has to be a function, though, that works with an unknown number of locations. The result should be a data frame (or several data frames, one per location) that shows the number of cases for each code at each location.

I know this is simple if the locations are known: just filter the data frame for each location and use dplyr::group_by and dplyr::summarize. But I want an automatic function for which I do not know beforehand how many different locations there are.

I tried dplyr::group_split, but that returns a list of tibbles, and I can't call dplyr::group_by on a list.

This is the expected output:

      Codes     Location A           Codes      Location B
      70            2                60            3
      60            1                50            1

Thanks in advance for any answers; I am struggling with this big time.


Solution

  • We could use count to tally the rows for each Locations/Codes combination, and then split the result by Locations with group_split to get a list of data frames, one per location.

    library(dplyr)

    df_list <- df %>% count(Locations, Codes, sort = TRUE) %>% group_split(Locations)
    
    #[[1]]
    # A tibble: 2 x 3
    #  Locations Codes     n
    #  <chr>     <chr> <int>
    #1 a         70        2
    #2 a         60        1
    
    #[[2]]
    # A tibble: 2 x 3
    #  Locations Codes     n
    #  <chr>     <chr> <int>
    #1 b         60        3
    #2 b         50        1
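
Since the question asks for a function, the same idea could be wrapped as in the sketch below. The helper name count_codes_by_location is hypothetical, and the Cases column is an assumed placeholder because its definition is not shown in the question. The sketch uses base split instead of group_split so that the list elements are named after the locations; that swap is a design choice here, not part of the answer above.

```r
library(dplyr)

# Example data from the question; Cases is an assumed placeholder column.
df <- data.frame(
  Cases     = 1:7,
  Codes     = c("70", "70", "60", "60", "60", "60", "50"),
  Locations = c("a", "a", "a", "b", "b", "b", "b")
)

# Hypothetical helper: count the rows per Locations/Codes combination and
# return a named list with one data frame per location, however many
# locations the input contains.
count_codes_by_location <- function(data) {
  counts <- data %>% count(Locations, Codes, sort = TRUE)
  # base::split names each list element after its location value
  split(counts, counts$Locations)
}

res <- count_codes_by_location(df)
names(res)  # one entry per location found in the data
res$b       # counts of each code at location "b"
```

If a single data frame is enough, the count step alone already gives one row per location and code, so the split can be skipped. An equivalent spelling with the verbs named in the question would be group_by(Locations, Codes) followed by summarize(n = n(), .groups = "drop").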