I have a vector of strings:
ve <- c("N","A","A","A","N","ANN","NA","NFNFNAA","23","N","A","NN", "parnot", "important", "notall")
I want to keep only three possible values in this vector: N, A, and NA.
Therefore, I want to replace any element that is NOT N or A with NA.
How can I achieve this?
I have tried the following:
gsub(ve, pattern = '[^NA]+', replacement = 'NA')
gsub(ve, pattern = '[^N|^A]+', replacement = 'NA')
But these don't work well, because they replace every instance of "A" or "N" in every string with NA. So in some cases I end up with NANANANANANA, instead of simply NA.
If we are looking for fixed (whole-string) matches, use %in% with negation (!) to select the elements that are not among the allowed values, and assign 'NA' to them:
ve[!ve %in% c("A", "N", "NA")] <- 'NA'
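For illustration, here is a small self-contained sketch that applies the line above to the vector from the question. It also shows an anchored-regex variant, since the question started from gsub; that variant is my own addition, and the names allowed, res_in and res_re are hypothetical. The anchors ^ and $ force the pattern to match the whole string rather than individual characters inside it.

```r
ve <- c("N", "A", "A", "A", "N", "ANN", "NA", "NFNFNAA", "23", "N", "A", "NN",
        "parnot", "important", "notall")

# Allowed values (hypothetical helper name, not part of the question).
allowed <- c("A", "N", "NA")

# Fixed matching: anything not exactly equal to an allowed value becomes "NA".
res_in <- ve
res_in[!res_in %in% allowed] <- "NA"

# Regex variant: ^ and $ anchor the match to the entire string,
# so "ANN" or "NFNFNAA" do not match and are replaced as a whole.
res_re <- ifelse(grepl("^(N|A|NA)$", ve), ve, "NA")

print(res_in)
print(identical(res_in, res_re))
```

Both approaches should give the same vector here; %in% is usually the simpler choice when the allowed values are plain fixed strings.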
Note that in R, a missing value is the unquoted NA, not the quoted string "NA". If "NA" here is meant as a separate category rather than a missing value, I would advise giving it a different name to avoid confusion later when parsing.
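A short sketch of the difference, assuming you might want real missing values instead of the string; the names x and ve_missing are hypothetical:

```r
# The string "NA" and the missing value NA behave differently.
x <- c("N", "A", "NA", NA)
is.na(x)    # FALSE FALSE FALSE  TRUE  (only the unquoted NA is missing)
x == "NA"   # FALSE FALSE  TRUE    NA  (comparing with a missing value gives NA)

# Variant of the replacement that stores real missing values.
ve <- c("N", "A", "A", "A", "N", "ANN", "NA", "NFNFNAA", "23", "N", "A", "NN",
        "parnot", "important", "notall")
ve_missing <- ve
ve_missing[!ve_missing %in% c("A", "N")] <- NA

# Number of elements that are now missing.
n_missing <- sum(is.na(ve_missing))
print(n_missing)
```

With real missing values, functions such as is.na() and table(..., useNA = "ifany") treat those elements as missing, which the string "NA" would not trigger.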