Tags: r, dplyr, case-when, across

combine mutate(across) and case_when to fill multiple columns with 0 depending on condition


In a dplyr workflow I am trying to write a 0 into every column that comes after the newvar column wherever newvar == 0, and otherwise leave the values unchanged. I modified the iris dataset:

library(dplyr)
n <- 150 # sample size

iris1 <- iris %>% 
    mutate(id = row_number(), .before = Sepal.Length) %>% 
    mutate(newvar = sample(c(0,1), replace=TRUE, size=n), .before = Sepal.Length ) %>% 
    mutate(across(.[,3:ncol(.)], ~ case_when(newvar==0 ~ 0)))

I tried to adapt the solution from "How to combine the across() function with mutate() and case_when() to mutate values in multiple columns according to a condition?". My understanding:

  1. With .[,3:ncol(.)] I go through the columns after the newvar column.
  2. With case_when(newvar==0 ...) I try to set the condition.
  3. With ~ 0 after newvar==0 I try to say: write 0 if the condition is fulfilled.

I know that I am doing something wrong, but I don't know what! Thank you for your help.


Solution

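  • .[,3:ncol(.)] returns the values of those columns, not their positions, whereas across() expects a column selection (names or positions). Using 3:ncol(.) instead should work. Note also that case_when() returns NA for rows that match no condition, so a second branch is needed to keep the original values where newvar is not 0.

    In general, it is better to refer to columns by name rather than by position. You can also do everything in one mutate() call.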

    library(dplyr)
    
    n <- 150
    
    iris %>% 
      mutate(id = row_number(), 
             newvar = sample(c(0, 1), replace = TRUE, size = n), 
             # set to 0 where newvar == 0, otherwise keep the original value (.x)
             across(Sepal.Length:Petal.Width, ~ case_when(newvar == 0 ~ 0, 
                                                          newvar == 1 ~ .x)))
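
If you prefer to keep the column order and the positional selection from the question, a minimal sketch could look like the one below. It assumes the four measurement columns end up at positions 3 to 6 once id and newvar have been moved to the front, and it swaps case_when() for if_else(), which is not what the answer above uses. The seed and the name iris_zeroed are only there for illustration.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the random newvar is reproducible

iris_zeroed <- iris %>%
  mutate(id = row_number(), .before = Sepal.Length) %>%
  mutate(newvar = sample(c(0, 1), size = n(), replace = TRUE),
         .before = Sepal.Length) %>%
  # across() takes positions (3:6), not the column values (.[, 3:6]).
  # Column 7 is Species, a factor, so it is deliberately left out.
  mutate(across(3:6, ~ if_else(newvar == 0, 0, .x)))

head(iris_zeroed)
```

Stopping at column 6 matters here: a range such as 3:ncol(.) would also pick up Species, and mixing a numeric 0 with a factor column is likely to fail with a type error.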