Search code examples
rdataframesum

what is the function to sum rows in data frame in R?


I want to sum rows based on two different categorical columns. my data is like this :

df <- data.frame(country = c('US', 'US', 'US', 'UK', 'UK', 'UK', 'UK'),
             team = c('A1', 'A2', 'B1', 'B1', 'B2', 'C1', 'C2'),
             x1990 = c(4, 7, 8, 8, 8, 9, 12),
             x2005 = c(3, 3, 4, 4, 6, 7, 7))

and I want to prepare it like this:

df_sum <- data.frame(country = c('US', 'US', 'UK','UK'),
                 team = c('A', 'B','B', 'C'),
                 x1990 = c(11, 8,16,21),
                 x2005 = c(6, 4,10, 14))

Solution

  • You may try using dplyr like

    library(dplyr)
    df %>%
      mutate(team = sub("^([[:alpha:]]*).*", "\\1", team)) %>%
      group_by(country, team) %>%
      summarize(across(where(is.numeric), sum)) %>%
      arrange(desc(country))
    
      country team  x1990 x2005
      <chr>   <chr> <dbl> <dbl>
    1 US      A        11     6
    2 US      B         8     4
    3 UK      B        16    10
    4 UK      C        21    14