
How to find sum of certain rows in R to get a grand total per row?


I have a dataset that has employees' capacity each month, and I want to get a total for each employee across all months:

library(dplyr)
data <- tibble(employee = c("Justin", "Corey","Sibley", "Justin", "Corey","Sibley"),
               education = c("graudate", "student", "student", "graudate", "student", "student"),
               fte_max_capacity = c(1, 2, 3, 1, 2, 3),
               project = c("big", "medium", "small", "medium", "small", "small"),
               aug_2021 = c(1, 1, 1, 1, 1, 1),
               sep_2021 = c(1, 1, 1, 1, 1, 1),
               oct_2021 = c(1, 1, 1, 1, 1, 1),
               nov_2021 = c(1, 1, 1, 1, 1, 1))

I tried the following, based on the code found here, but I get this error:

data %>%
  dplyr::select(-contains("project")) %>%
  dplyr::group_by(employee) %>%
  mutate(sum = rowSums(select(., vars(contains("_20")))))

Error: Problem with `mutate()` input `sum`.
x Must subset columns with a valid subscript vector.
x Subscript has the wrong type `quosures`.
ℹ It must be numeric or character.
ℹ Input `sum` is `rowSums(select(., vars(contains("_20"))))`.
ℹ The error occurred in group 1: employee = "Corey".

I also tried a modified version of the solution from this website, but that gives an error too, even though all the relevant columns are numeric:

data %>%
  dplyr::select(-contains("project")) %>%
  dplyr::group_by(employee) %>%
  mutate_at(vars(contains("_20"), rowSums(., na.rm = T)))

Error: 'x' must be numeric


Solution

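  • The data is grouped, so do the select on cur_data(). Inside a grouped mutate(), the dot (.) still refers to the whole data frame, so select() carries the grouping column employee along; it is character, which makes rowSums() fail. cur_data() returns only the current group's rows, without the grouping column. Also, vars() is not needed inside select(): pass contains("_20") directly. Wrapping rowSums() in sum() then adds up the row totals within each employee: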

    library(dplyr)
    data %>%
      dplyr::select(-contains("project")) %>%
      dplyr::group_by(employee) %>%
      dplyr::mutate(sum = sum(rowSums(select(cur_data(), contains("_20"))))) %>%
      ungroup()
    

    -output

    # A tibble: 6 x 8
      employee education fte_max_capacity aug_2021 sep_2021 oct_2021 nov_2021   sum
      <chr>    <chr>                <dbl>    <dbl>    <dbl>    <dbl>    <dbl> <dbl>
    1 Justin   graudate                 1        1        1        1        1     8
    2 Corey    student                  2        1        1        1        1     8
    3 Sibley   student                  3        1        1        1        1     8
    4 Justin   graudate                 1        1        1        1        1     8
    5 Corey    student                  2        1        1        1        1     8
    6 Sibley   student                  3        1        1        1        1     8
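
As a possible alternative, here is a minimal sketch that avoids cur_data() entirely by computing the per-row total before grouping, then summing it per employee. It uses a trimmed copy of the example data, and the column name row_total is an assumption introduced here for illustration. It may also be worth knowing that cur_data() is deprecated in dplyr 1.1.0 and later in favour of pick(), so a version-independent approach like this one can be useful.

```r
library(dplyr)

# Trimmed copy of the example data (only the columns needed here)
data <- tibble(
  employee = c("Justin", "Corey", "Sibley", "Justin", "Corey", "Sibley"),
  aug_2021 = c(1, 1, 1, 1, 1, 1),
  sep_2021 = c(1, 1, 1, 1, 1, 1),
  oct_2021 = c(1, 1, 1, 1, 1, 1),
  nov_2021 = c(1, 1, 1, 1, 1, 1)
)

result <- data %>%
  # Ungrouped step: the dot is the whole tibble, and select() keeps only
  # the numeric month columns, so rowSums() gets numeric input.
  # row_total is a hypothetical helper column name.
  mutate(row_total = rowSums(select(., contains("_20")))) %>%
  # Grouped step: add up the row totals within each employee
  group_by(employee) %>%
  mutate(sum = sum(row_total)) %>%
  ungroup()

print(result)
```

With this data, row_total is 4 on every row (four months at 1 each) and sum is 8 for every employee (two rows each), which matches the output above. If only the per-row total is wanted, the group_by() step can be dropped.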