R Replace NA for all Columns Except *


library(tidyverse)
df <- tibble(Date = c(rep(as.Date("2020-01-01"), 3), NA),
             col1 = 1:4,
             thisCol = c(NA, 8, NA, 3),
             thatCol = 25:28,
             col999 = rep(99, 4))
#> # A tibble: 4 x 5
#>   Date        col1  thisCol thatCol col999
#>   <date>     <int>    <dbl>   <int>  <dbl>
#> 1 2020-01-01     1       NA      25     99
#> 2 2020-01-01     2        8      26     99
#> 3 2020-01-01     3       NA      27     99
#> 4 NA             4        3      28     99

My actual R data frame has hundreds of columns that aren't neatly named, but can be approximated by the df data frame above.

I want to replace every NA with 0, except in a few columns (in my example, I want to leave the Date column and the thatCol column untouched). I'd like to do it in this sort of fashion, but the naive version fails on the Date column:

df %>% replace(is.na(.), 0)
#> Error: Assigned data `values` must be compatible with existing data.
#> i Error occurred for column `Date`.
#> x Can't convert <double> to <date>.
#> Run `rlang::last_error()` to see where the error occurred.

My unsuccessful attempts at an "everything except" NA replacement are shown below.

df %>% replace(is.na(c(., -c(Date, thatCol)), 0))
df %>% replace_na(list([, c(2:3, 5)] = 0))
df %>% replace_na(list(everything(-c(Date, thatCol)) = 0))

Is there a way to select everything except certain columns in the way I need? There are hundreds of columns, named inconsistently, so typing them one by one is not a practical option.


Solution

  • You can use mutate_at:

    library(dplyr)
    

    Exclude columns by name

    df %>% mutate_at(vars(-c(Date, thatCol)), ~replace(., is.na(.), 0))
    

    Exclude columns by position

    df %>% mutate_at(-c(1,4), ~replace(., is.na(.), 0))
    

    Select them by name

    df %>% mutate_at(vars(col1, thisCol, col999), ~replace(., is.na(.), 0))
    

    Select them by position

    df %>% mutate_at(c(2, 3, 5), ~replace(., is.na(.), 0))
    

    If you want to use replace_na

    df %>% mutate_at(vars(-c(Date, thatCol)), tidyr::replace_na, 0)
    

    Note that mutate_at has been superseded by across() as of dplyr 1.0.0.
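
For reference, here is a minimal sketch of how the "exclude by name" version might look with across(), assuming dplyr 1.0.0 or later is installed. The variable name result is only illustrative; the data frame is the df from the question.

```r
library(dplyr)

# Example data from the question
df <- tibble(
  Date    = c(rep(as.Date("2020-01-01"), 3), NA),
  col1    = 1:4,
  thisCol = c(NA, 8, NA, 3),
  thatCol = 25:28,
  col999  = rep(99, 4)
)

# Replace NA with 0 in every column except Date and thatCol.
# The negated selection works like vars(-c(...)) in mutate_at.
result <- df %>%
  mutate(across(-c(Date, thatCol), ~ replace(.x, is.na(.x), 0)))

print(result)
```

The NA in Date is left alone because that column is excluded from the selection, so no attempt is made to write a number into a date column. Excluding by position should work the same way with across(-c(1, 4), ...).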