r dplyr case data-manipulation

case_when with three conditions to update NA rows


I am populating a column based on other columns. The idea is: if Sexual.Maturity is NA (the other values were already filled in from tissue analysis), set it to Mature or Immature depending on sex and length.

Therefore I have the following code:

data <- data %>%
  mutate(Sexual.Maturity = case_when(
    (Sexual.Maturity==NA & Sex== "M" & Length >= 206.49) ~ "Mature",
    (Sexual.Maturity==NA & Sex== "M" & Length < 206.49) ~ "Immature",
    (Sexual.Maturity==NA & Sex== "F" & Length >= 188.8 ) ~ "Mature",
    (Sexual.Maturity==NA & Sex== "F" & Length < 188.8 ) ~ "Immature",    
    TRUE ~ NA_character_
    
  ))

Unfortunately, this fills the entire column with NA. I also tried the following instead, but it overwrote my existing values incorrectly:

data <- data %>%
  mutate(Sexual.Maturity = case_when(
    (Sexual.Maturity==NA | Sex== "M" | Length >= 206.49) ~ "Mature",
    (Sexual.Maturity==NA | Sex== "M" | Length < 206.49) ~ "Immature",
    (Sexual.Maturity==NA | Sex== "F" | Length >= 188.8 ) ~ "Mature",
    (Sexual.Maturity==NA | Sex== "F" | Length < 188.8 ) ~ "Immature",    
    TRUE ~ NA_character_
    
  ))

I am happy to try another approach that does not use case_when, for example base R, if that is handier with three conditions.


Solution

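  • NA cannot be tested with ==: any comparison such as x == NA returns NA rather than TRUE or FALSE, so none of the conditions ever match. Use is.na() instead. Also, to keep the existing values when no condition is met, pass the column itself as .default (this argument requires dplyr 1.1.0 or later):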

    data <- data %>%
      mutate(Sexual.Maturity = case_when(
        is.na(Sexual.Maturity) & Sex == "M" & Length >= 206.49 ~ "Mature",
        is.na(Sexual.Maturity) & Sex == "M" & Length < 206.49 ~ "Immature",
        is.na(Sexual.Maturity) & Sex == "F" & Length >= 188.8 ~ "Mature",
        is.na(Sexual.Maturity) & Sex == "F" & Length < 188.8 ~ "Immature",    
        .default = Sexual.Maturity))
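
Since the question also asks about base R, here is a minimal sketch of the same logic without dplyr. The toy data frame and the helper names `thresholds`, `threshold`, and `missing` are made up for illustration; only the column names and the two cut-offs come from the question.

```r
# Toy data: column names follow the question, the values are invented.
data <- data.frame(
  Sex             = c("M", "M", "F", "F", "M"),
  Length          = c(210, 150, 190, 120, 150),
  Sexual.Maturity = c(NA, NA, NA, NA, "Mature"),
  stringsAsFactors = FALSE
)

# Why the original attempt failed: comparing with NA yields NA, never TRUE.
NA == NA     # NA
is.na(NA)    # TRUE

# Sex-specific length cut-offs from the question, looked up by name.
# Any Sex value other than "M" or "F" (including NA) gets an NA threshold.
thresholds <- c(M = 206.49, F = 188.8)
threshold  <- unname(thresholds[as.character(data$Sex)])

# Only rows that are still NA are touched; existing values stay as they are.
missing <- is.na(data$Sexual.Maturity)
data$Sexual.Maturity[missing] <- ifelse(
  data$Length[missing] >= threshold[missing], "Mature", "Immature"
)

print(data)
```

Rows with an unknown Sex or a missing Length end up NA here, which should match what the case_when version does when none of its conditions apply.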