Tags: r, dplyr, r-factor

Make a new variable with selected levels of another variable


I'm having trouble creating a new variable from selected levels of another variable. The data set is gss and the variable is class, a factor with 5 levels ("Lower Class", "Working Class", "Middle Class", "Upper Class", "No Class") plus NA values.

If I run,

gss %>% 
select(class) %>%
str()

It gives me

'data.frame':   57061 obs. of  1 variable:
$ class: Factor w/ 5 levels "Lower Class",..: 3 3 2 3 2 3 3 2 2 2 ...

Since I am only interested in respondents who specified their economic class, I would like to drop the "No Class" level and the NA values. I did not know a better way to do this, so I did:

gss <- gss %>%
  mutate(filteredclass = ifelse(class == "Lower Class", "Lower Class",
    ifelse(class == "Working Class", "Working Class",
    ifelse(class == "Middle Class", "Middle Class",
    ifelse(class == "Upper Class", "Upper Class", NA)))))

Then, I tried to see whether it worked or not, so I ran:

with (gss, table(filteredclass))

which gave me the levels in a different order:

filteredclass
Lower Class  Middle Class   Upper Class Working Class 
     3147         24289          1741         24458

I would like the new variable filteredclass to be shown in the same order as the variable class. If I do the same with class, it gives me:

with (gss, table(class))
class
Lower Class Working Class  Middle Class   Upper Class 
     3147         24458         24289          1741 
 No Class 
        1 

Is there any way to fix this? Also, is there a way to drop the "No Class" level without going through the mutate command above?

Thanks for your help in advance!


Solution

  • The easiest way may be to call factor() on class, listing only the levels you want to keep:

    gss$filteredclass <- factor(gss$class, c("Lower Class", "Working Class",
                                 "Middle Class", "Upper Class"))
    

    Any value not listed in the levels, here "No Class", becomes NA, existing NA values stay NA, and the remaining levels keep the order given.
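For context, the nested ifelse() in the question returns a character vector, and table() sorts character values alphabetically, which is why the order changed. The sketch below reproduces the factor() approach on a small made-up data frame; toy, class_levels and keep are hypothetical names used only for illustration, not part of the real gss data.

```r
# Toy stand-in for gss (made-up values, not the real survey data).
class_levels <- c("Lower Class", "Working Class", "Middle Class",
                  "Upper Class", "No Class")
toy <- data.frame(
  class = factor(c("Middle Class", "Working Class", "No Class",
                   "Lower Class", NA, "Upper Class", "Working Class"),
                 levels = class_levels)
)

# Keep every existing level except "No Class"; setdiff() preserves
# the order of its first argument, so the original ordering survives.
keep <- setdiff(levels(toy$class), "No Class")
toy$filteredclass <- factor(toy$class, levels = keep)

# table() now follows the level order rather than alphabetical order,
# and both the "No Class" row and the original NA row are NA.
table(toy$filteredclass)
sum(is.na(toy$filteredclass))
```

Building keep from levels(toy$class) avoids retyping the labels. If you prefer to stay in a dplyr pipeline, the same factor(class, levels = keep) call should also work inside mutate().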