Search code examples
rselectna.when

Selecting Specific Data in an Data Frame to replace using the Row and Column names


I am attempting to replace specific NA values with 0 in my data table. I do not want all NAs replaces, only those under certain conditions. For example, "replace NA with Zeros when the row is Cole_1 and the Column includes the designation 'Fall1'". I have a huge data set, so I need as little manual designating as possible, numbering each column is not an option. Basically, I want to be able to target the cells like playing battleship.

I have tried:

whentest <- count_order_site %>% 
  when(select(contains("Fall1")) & 
  count_order_site[count_order_site$Point_Name == "Cole_1", ], 
  count_order_site[is.na(count_order_site)] <- 0 )  

but get an error "contains() must be used within a selecting function." I'm not even sure if this is the right path to get what I want.

The basic layout idea (Sorry it's stacked weird, I can't figure out how to make them next to each other):

Point Name ACWO_Fall1
Cole_1 NA
Cole_2 3
ACWO_FAll2 HOSP_FAll1
3 NA
NA 5

After the functions the data would look like:

Point Name ACWO_Fall1
Cole_1 0
Cole_2 3
ACWO_FAll2 HOSP_FAll1
3 0
NA 5

Solution

  • If I understand correctly, you can use mutate across to include columns that contain certain character values, such as "Fall1". Then, with the replace function, replace those values that are missing using is.na and where the point_name has a specific value, such as "Cole_1".

    The example below has a couple extra columns to demonstrate if the logic is correct.

    library(tidyverse)
    
    df %>%
      mutate(across(contains("Fall1"), ~replace(., is.na(.) & point_name == "Cole_1", 0)))
    

    Output

      point_name ACWO_Fall1 ACWO_Fall2 HOSP_Fall1 Other1 Other_Fall1
    1     Cole_1          0          3          0     NA           6
    2     Cole_2          3         NA          5     NA          NA