r dplyr tidyverse expss

Recoding into a new categorical variable, where a category gets converted to NA


I am trying to recode a variable so that one category becomes NA, but I am running into problems.

My data for an income variable looks like this:

Income1
0
1
1
2
2
0 

I want to recode the 0s as NA. The 0s represent respondents who marked 'choose not to answer'. I have tried this:

> Comm %>%
+   mutate(Income2 = case_when(Income1 = 0 ~ NA_real_,
+                              Income1 = 1 ~ 'Less than 50K'
+                              Income1 = 2 ~ 'More than 50K'))

but I keep getting this error:

Error in `mutate()`:
ℹ In argument: `Income2 = case_when(...)`.
Caused by error in `case_when()`:
! `0` must be a vector with type <logical>.
Instead, it has type <double>.

I tried converting Income1 to a logical, but that did not work either. So I tried the expss package (an SPSS-like package), where I wanted to retain the 1s and 2s:

Comm$Income2 = recode(Comm$Income1, "No answer" = 0 ~ NA_real_, 
                          "Less Than 50K" = 1 ~ 1, 
                          "More Than 50K" = 2 ~ 2)

That did not work because:

Error in process_recodings(x, unlist(list(...), recursive = TRUE), make_empty_vec(x),  : 
  'recode' - labelled recodings should recode into single not-NA value but we have: 0 ~ NA

Thank you for reading, any advice would help!


Solution

  • For recode from expss, the error message says: "labelled recodings should recode into single not-NA value". You need to remove the label from the recoding to NA, because a label on an NA (missing) value is not allowed in either SPSS or expss. The code below works:

    Comm$Income2 = recode(Comm$Income1, 0 ~ NA_real_, 
                          "Less Than 50K" = 1 ~ 1, 
                          "More Than 50K" = 2 ~ 2)
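
  • For the dplyr attempt, a minimal sketch of a likely fix. The error in the question arises because `Income1 = 0 ~ ...` uses `=` (which makes `Income1` a named argument and leaves the number 0 as the condition) instead of the comparison `==`. The original call is also missing a comma after the second condition, and it mixes NA_real_ with character results, so NA_character_ is used here instead. The toy data frame below is an assumption that mirrors the sample values in the question; Income3 is a hypothetical extra column showing how to keep the numeric codes with 0 turned into NA.

```r
library(dplyr)

# Toy data mirroring the question (assumed; the real `Comm` has more columns)
Comm <- data.frame(Income1 = c(0, 1, 1, 2, 2, 0))

Comm <- Comm %>%
  mutate(
    # Use `==` for comparison; all right-hand sides must share one type,
    # so the missing value is NA_character_ rather than NA_real_
    Income2 = case_when(
      Income1 == 1 ~ "Less than 50K",
      Income1 == 2 ~ "More than 50K",
      TRUE ~ NA_character_   # 0 ("choose not to answer") becomes NA
    ),
    # Optional: turn the labels into a categorical (factor) variable
    Income2 = factor(Income2, levels = c("Less than 50K", "More than 50K")),
    # Hypothetical extra column: keep the 1s and 2s, turn 0 into NA
    Income3 = na_if(Income1, 0)
  )
```

    Here `TRUE ~ NA_character_` acts as the fallback for every row that matched no earlier condition, so the 0s do not need their own line.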