Set variable values to missing in R and drop unused levels

I have a data set, DATA, with a variable, VAR. This variable's mode is numeric, and its class is factor. It represents gender. When printed, it looks something like this:

 VAR
  M
  M
  F
  U

  M

When I print out levels, it outputs: "" "F" "M" "U", and a frequency table looks like this:

     F     M     U
 2   30    25    1

What I want to do is set everything that is not "F" or "M" to missing, relabel the remaining values as "Man" and "Woman", and drop the unused levels of the variable (but still keep a level for missing). So far I have the code below:

DATA$VAR[DATA$VAR == "U" | DATA$VAR == ""] <- NA

But I got the exact same values for the levels, and now the frequency table looks like this:

     F     M     U
 0   30    25    0

I feel like I'm close, but not quite there. I don't understand how to deal with the level issues. Any help is greatly appreciated.

Solution

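To create a factor in which everything except M and F becomes missing, pass only those two values as the levels argument in a call to factor(). Any value not listed in levels is converted to NA, and its level is dropped. To relabel the remaining levels, use the labels argument, given in the same order as levels: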

a <-  factor(c("M","M","F","U","","M"))

a2 <- factor(a, levels = c('M','F'), labels =c('Male','Female'))

a2
# [1] Male   Male   Female <NA>   <NA>   Male  
# Levels: Male Female
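
Applied to the data frame from the question, the same call might look like the sketch below. The small DATA frame is a made-up stand-in, since the real data is not shown, and the labels "Man" and "Woman" follow the wording in the question.

```r
# Hypothetical stand-in for the asker's data; the real DATA is not shown
DATA <- data.frame(VAR = factor(c("M", "M", "F", "U", "", "M")))

# Keep only "M" and "F"; every other value becomes NA and its level is dropped
DATA$VAR <- factor(DATA$VAR, levels = c("M", "F"), labels = c("Man", "Woman"))

levels(DATA$VAR)
# [1] "Man"   "Woman"
```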

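If you want to tally NA values in table(), set useNA = 'always' (an NA column is always shown, even when its count is zero) or useNA = 'ifany' (an NA column is shown only when missing values are present):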

table(a2, useNA = 'ifany')
##   a2
##   Male Female   <NA> 
##     3      1      2
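
The assignment from the question also works once the now-empty levels are removed with droplevels(). And if NA should be an actual level of the factor, rather than only a column in the table, addNA() is one option. A minimal sketch of both, where a3 and a4 are names introduced here for illustration:

```r
a <- factor(c("M", "M", "F", "U", "", "M"))

# Assigning NA leaves the levels "" and "U" in place; droplevels() removes them
a3 <- a
a3[a3 %in% c("U", "")] <- NA
a3 <- droplevels(a3)
levels(a3)
# [1] "F" "M"

# addNA() appends NA as an explicit level, so table() counts it by default
a4 <- addNA(a3)
levels(a4)
# [1] "F" "M" NA
table(a4)
```

Note that droplevels() keeps the original level order ("F" before "M") and does not relabel anything, so the factor(levels = , labels = ) call above is still the shorter route when new labels are wanted.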