r, dummy-variable

Indicator Variable for Missing Values


I need to create a new column in my wbpol dataset named wbpol$missing.

This column should contain 1 if there is an NA in any of the other columns for that row, and 0 if there are no NAs in the other columns of that row.

This is my current code:

wbpol$missing<-ifelse(apply(wbpol, 1, anyNA), TRUE == 1, FALSE == 0)

When I run the code, however, every value of wbpol$missing is TRUE. I need it to be 1 if there is an NA in any of the other columns of the row and 0 if there is not.

How do I do this?


Solution

  • In ifelse(), the second and third arguments are the values to return where the condition in the first argument is TRUE or FALSE, respectively.

    In your code, the second argument is the expression TRUE == 1 and the third is the expression FALSE == 0. When R compares a logical to a number, it coerces TRUE to 1 and FALSE to 0, so both TRUE == 1 and FALSE == 0 evaluate to TRUE. ifelse() therefore returns TRUE for every row, which is why your column is filled with TRUE. You can confirm this by entering TRUE == 1 or FALSE == 0 in the R console.

    Instead, pass the values 1 and 0 directly. The following returns 1 for rows where the condition is TRUE and 0 for rows where it is FALSE:

    wbpol$missing<-ifelse(apply(wbpol, 1, anyNA), 1, 0)
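To see the difference on something small, here is a minimal sketch using a made-up three-row data frame in place of the real wbpol (the column names gdp and polity are hypothetical, chosen only for illustration). It also shows one possible alternative based on complete.cases(), which is not part of the answer above but should give the same indicator without ifelse():

```r
# Hypothetical toy data standing in for wbpol; the column names are invented
wbpol <- data.frame(
  gdp    = c(1.2, NA, 3.4),
  polity = c(10, 7, NA)
)

# Both comparisons evaluate to TRUE, so the original ifelse() call
# could only ever return TRUE
TRUE == 1
FALSE == 0

# Alternative sketch (an assumption, not from the answer above):
# complete.cases() is TRUE for rows with no NA, so negate it and
# convert the logical to 0/1
missing_alt <- as.integer(!complete.cases(wbpol))

# Corrected call from the answer: 1 if the row contains any NA, else 0
wbpol$missing <- ifelse(apply(wbpol, 1, anyNA), 1, 0)

print(wbpol)
```

With this toy data, the first row has no missing values and the other two each contain one NA, so missing is 0, 1, 1. Note that apply() converts the data frame to a matrix first, so on a large dataset the complete.cases() version may be the cheaper option.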