r dataframe replace wildcard na

Replace with NA in the whole dataframe using wildcards


I have a data frame df like this:

Col1 Col2 Col3
1 2 5
6 x1A 9
8 x3 7
5 3 x4Z

I want to replace all values starting with x with NA. The result should be:

Col1 Col2 Col3
1 2 5
6 NA 9
8 NA 7
5 3 NA

I tried different solutions with no success:

df[-grep(pattern = "X^",df),]    
df %>% mutate_all(~na_if(.,"X*"))    
df%>% mutate_all(~na_if(.,matches("X*")))    
df %>% mutate_all(~na_if(.,glob2rx("X*")))

Solution

  • The problem is that, according to the na_if help page, na_if(x, y) only replaces the values of x that are exactly equal to y. It does no partial or pattern matching, and y cannot be a logical condition.

    You could use if_else instead

    # grepl() is case-sensitive and the sample values start with a lowercase "x"
    df %>% mutate_all(~if_else(grepl("^x", .), NA, .))
    

    Note that this will not change the class of the column. If a column contained a value such as "x3" when the data was read in, the whole column was read as a character vector, so replacing those values with NA still leaves the column as a character vector.
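
    For reference, here is a minimal base R sketch of the same idea that runs without any packages. It is an illustration under stated assumptions, not the answer's exact method: the sample data is rebuilt by hand, the names replace_x, cleaned and converted are made up for this example, and replace() is used in place of if_else(). Depending on the dplyr version, if_else() can be strict about the type of NA (NA_character_ may be needed for character columns), which replace() sidesteps. The last step uses type.convert() to re-infer column types, which addresses the class issue noted above.

```r
# Rebuild the sample data by hand (an assumption about how it was read in):
# columns that contain an "x..." value are character, Col1 is integer.
df <- data.frame(
  Col1 = c(1L, 6L, 8L, 5L),
  Col2 = c("2", "x1A", "x3", "3"),
  Col3 = c("5", "9", "7", "x4Z"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: set every value that starts with "x" to NA.
# grepl() coerces non-character columns to character before matching,
# and replace() keeps the column's original type.
replace_x <- function(col) replace(col, grepl("^x", col), NA)

# Apply the helper to every column while keeping the data frame structure.
cleaned <- df
cleaned[] <- lapply(df, replace_x)

# Col2 and Col3 are still character at this point; optionally re-infer
# the column types now that the non-numeric values are gone.
converted <- type.convert(cleaned, as.is = TRUE)

print(cleaned)
str(converted)
```

    In this sketch cleaned keeps Col2 and Col3 as character columns, while converted holds the re-inferred types. If a case-insensitive match is wanted, grepl("^x", col, ignore.case = TRUE) would match both "x" and "X".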