
Across with NA values


I was trying to take the mean of some columns using across, but instead of one new column I got a separate new column for each column the mean was applied to. Is it working properly?

    library(tidyverse)
    cars %>% 
      as_tibble() %>% 
      add_case(speed = 11, dist = NA, .before = 1) %>% 
      add_column(names = str_c("a",1:51)) %>% 
      rename_with(.cols =  -names, ~str_c("one_",.x)) %>% 
      group_by(names) %>% 
      mutate(two = across(starts_with("one"), .fns = mean))

In the vignette it shows this example:

    df %>% mutate_at(vars(c(x, starts_with("y"))), mean)
    # ->
    df %>% mutate(across(c(x, starts_with("y")), mean, na.rm = TRUE))

I would expect every row containing an NA to produce NA, rather than getting another column.


Solution

  • I don't see the need for across here if you want the row-wise mean of two columns; rowMeans does that directly.

    library(tidyverse)  # add_case()/add_column() come from tibble, str_c() from stringr
    cars %>% 
      as_tibble() %>% 
      add_case(speed = 11, dist = NA, .before = 1) %>% 
      add_column(names = str_c("a",1:51)) %>% 
      rename_with(.cols =  -names, ~str_c("one_",.x)) %>% 
      mutate(two = rowMeans(select(., starts_with('one')), na.rm = TRUE))
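
    To make the effect of na.rm = TRUE concrete, here is a minimal base R sketch. The data frame df and its values are made up for illustration (they only mimic the added speed = 11, dist = NA row) and are not the cars data.

    ```r
    # Hypothetical toy data: the first row mirrors speed = 11, dist = NA.
    df <- data.frame(one_speed = c(11, 4), one_dist = c(NA, 2))

    # NA values are dropped before averaging each row.
    with_na_rm <- rowMeans(df, na.rm = TRUE)

    # Default behaviour: any NA in a row makes that row's mean NA.
    without_na_rm <- rowMeans(df)

    with_na_rm
    without_na_rm
    ```

    If you actually want NA for any row that contains an NA, as the question suggests, dropping na.rm = TRUE should give that.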
    

    You can also use rowwise() with c_across(), but it will be less efficient than rowMeans().

    cars %>% 
      as_tibble() %>% 
      add_case(speed = 11, dist = NA, .before = 1) %>% 
      add_column(names = str_c("a",1:51)) %>% 
      rename_with(.cols =  -names, ~str_c("one_",.x)) %>% 
      rowwise() %>%
      mutate(two = mean(c_across(starts_with('one')), na.rm = TRUE))
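
    As for why the original call produced extra columns: as far as I understand it, across() returns a tibble with one column per selected input, so writing two = across(...) stores that whole tibble under the name two instead of collapsing it to a single vector. The sketch below uses a hypothetical two-row tibble (df, id and the values are made up for illustration) to show this.

    ```r
    library(dplyr)

    # Hypothetical toy data with one row per group, as in the question.
    df <- tibble(id = c("a1", "a2"), one_speed = c(11, 4), one_dist = c(NA, 2))

    # Naming the across() result stores a data frame inside the column `two`.
    nested <- df %>%
      group_by(id) %>%
      mutate(two = across(starts_with("one"), mean))

    is.data.frame(nested$two)  # `two` is itself a data frame
    names(nested$two)          # one inner column per selected input

    # Without the `two =` name, the results overwrite the selected columns.
    flat <- df %>%
      group_by(id) %>%
      mutate(across(starts_with("one"), mean))

    names(flat)
    ```

    When printed, such a nested column typically shows up as two$one_speed and two$one_dist, which is probably what looked like new columns. Note also that with one row per group, each mean is just the value itself.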