r  conditional-statements  na  missing-data  nonblank

"OR" condition in r with NA and blank


I have recently seen a problem that I don't understand. Here you have:

 x <- c(1,2,3,4,45,654,3,NA," ",8,5,64,54)

And the || condition in R is not doing what I want, which is to identify both NAs and blanks:

if(is.na(x) || x==" ") {...}

I expect the condition in the if statement to be TRUE, but it is FALSE instead. Can anybody please help me understand this issue? Thanks!

Edit:

Sorry guys, I meant to use the condition in an if statement, so its length should be 1. | does not apply here.


Solution

  • Note that both sides of the || are vectors

    x <- c(1,2,3,4,45,654,3,NA," ",8,5,64,54)
    
    is.na(x)
    ##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
    ## [13] FALSE
    
    x == " "
    ##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE    NA  TRUE FALSE FALSE FALSE
    ## [13] FALSE
    

    and if either side has more than one element, || uses only its first element (this is the behavior before R 4.3.0; since R 4.3.0 it is an error instead), so

    is.na(x) || x == " "
    ## [1] FALSE
    

    is the same as:

    is.na(x)[1] || (x == " ")[1]
    ## [1] FALSE
    

    where both sides of || are FALSE in the above line.

    If you are trying to determine if any element is NA or " " then

    anyNA(x) || ( " " %in% x )
    ## [1] TRUE
    

    You could also use any(x == " ") for the right-hand side, but that only works because of short-circuiting: || never evaluates its right-hand side when the left-hand side is TRUE, so any(x == " ") is not run when x contains an NA. On its own, any(x == " ") returns NA rather than TRUE or FALSE whenever x contains an NA but no " " (for example, if x were entirely NA), which is why the NA test on the left-hand side matters.
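To make the NA behavior concrete, here is a small sketch using a made-up vector y (not from the question) that contains an NA but no blank. It shows any() returning NA, and two ways around that: dropping NAs with na.rm = TRUE, or letting || short-circuit on anyNA().

```r
# y is a hypothetical vector for illustration: it has an NA but no " "
y <- c("1", NA, "3")

any(y == " ")                # NA: one comparison is unknown and none is TRUE
any(y == " ", na.rm = TRUE)  # FALSE: the NA comparison is dropped first
" " %in% y                   # FALSE: %in% never returns NA

# anyNA(y) is TRUE, so || stops there and never evaluates the right-hand side
found <- anyNA(y) || any(y == " ")
found                        # TRUE
```

Which variant fits best depends on whether an NA should count as a hit (as in the question) or simply be ignored.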

    (Note that | is different: it works element-wise and does not short-circuit, so both sides are evaluated before the | is performed.)
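If the goal is to find out which elements are NA or blank, rather than whether any are, the element-wise | can be used and its result collapsed with any() before it goes into if. A minimal sketch with the x from the question (the names is_missing and has_missing are made up here):

```r
x <- c(1, 2, 3, 4, 45, 654, 3, NA, " ", 8, 5, 64, 54)

# Element-wise test: TRUE | NA is TRUE, so the NA position comes out TRUE, not NA
is_missing <- is.na(x) | x == " "
which(is_missing)   # positions 8 and 9

# Collapse to a single TRUE/FALSE before handing it to if()
has_missing <- any(is_missing)
if (has_missing) {
  message("x contains at least one NA or blank")
}
```

Here the length-one requirement of if is met by any(), not by ||.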