Why does my R mutate across with rowSums not work (Error: Problem with `mutate()` input `..2`. x 'x' must be numeric ℹ Input `..2` is `rowSums(.)`.)?


I'm trying to learn how to use the across() function in R, and I want to do a simple rowSums() with it. However, I keep getting this error:

Error: Problem with mutate() input ..2. x 'x' must be numeric ℹ Input ..2 is rowSums(., na.rm = TRUE).

Yet all my relevant columns are numeric. Any help or explanation of why I'm getting this error would be greatly appreciated!

Here's a reproducible example:

library(dplyr)
test <- tibble(resource_name = c("Justin", "Corey", "Justin"),
       project = c("P1", "P2", "P3"),
       sep_2021 = c(1, 2, NA),
       oct_2021 = c(5, 2, 1))


test %>%
  select(resource_name, project, sep_2021, oct_2021) %>%
  mutate(total = across(contains("_20")), rowSums(., na.rm = TRUE))

And here's what I'm going for:

answer <-  tibble(resource_name = c("Justin", "Corey", "Justin"),
                  project = c("P1", "P2", "P3"),
                  sep_2021 = c(1, 2, NA),
                  oct_2021 = c(5, 2, 1),
                  total = c(6, 4, 1))

Note: my real dataset has many columns, and the order is variable. Because of that, I really want to use the contains("_20") portion of my code and not the indices.


Solution

  • One option is adorn_totals() from the janitor package, which adds a total column over the numeric columns:

    library(dplyr)
    library(janitor)
    test %>%
      adorn_totals("col", name = "total")
    

    -output

      resource_name project sep_2021 oct_2021 total
            Justin      P1        1        5     6
             Corey      P2        2        2     4
            Justin      P3       NA        1     1
    

    With rowSums and across, the syntax would be

    test %>% 
       mutate(total = rowSums(across(contains("_20")), na.rm = TRUE))
    

    -output

    # A tibble: 3 x 5
      resource_name project sep_2021 oct_2021 total
      <chr>         <chr>      <dbl>    <dbl> <dbl>
    1 Justin        P1             1        5     6
    2 Corey         P2             2        2     4
    3 Justin        P3            NA        1     1
    

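    In the OP's code, the parenthesis after contains("_20") closes across(), so rowSums(., na.rm = TRUE) is passed to mutate() as a separate second argument (the `..2` in the error message). There, `.` is the entire piped data frame, including the character columns resource_name and project, not the columns that across() selected. That is why rowSums() complains that 'x' must be numeric.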
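
To make the explanation above concrete, here is a minimal sketch that reproduces the failure outside of mutate() and then applies the fix. The names whole_data_error, result, and result_rowwise are made up for illustration. The rowwise() / c_across() variant at the end is not part of the original answer; it is shown only as one possible alternative, assuming dplyr 1.0.0 or later.

```r
library(dplyr)

test <- tibble(resource_name = c("Justin", "Corey", "Justin"),
               project = c("P1", "P2", "P3"),
               sep_2021 = c(1, 2, NA),
               oct_2021 = c(5, 2, 1))

# rowSums() on the whole tibble fails because of the character columns.
# tryCatch() captures the error message instead of stopping the script.
whole_data_error <- tryCatch(
  rowSums(test, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)

# Fix: hand rowSums() only the columns that across() selects.
result <- test %>%
  mutate(total = rowSums(across(contains("_20")), na.rm = TRUE))

# Alternative (assumption, not from the original answer): sum row by row.
result_rowwise <- test %>%
  rowwise() %>%
  mutate(total = sum(c_across(contains("_20")), na.rm = TRUE)) %>%
  ungroup()
```

Both versions select columns by name with contains("_20"), so they do not depend on column order. On a large data frame the rowSums() form is likely the faster of the two, since rowwise() evaluates the sum once per row.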