I want to replace values in a factor variable depending on another column, while not changing the initial factor levels.
Example:
x <- structure(list(Payee = structure(c(NA, 1L, 2L),
    .Label = c("0", "x"), class = "factor"), PayeeID_Hash = structure(c(NA, 1L, 2L),
    .Label = c("0x31BCA02", "0xB672841"), class = "factor")),
    row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))
> x
# A tibble: 3 x 2
  Payee PayeeID_Hash
  <fct> <fct>
1 NA    NA
2 0     0x31BCA02
3 x     0xB672841
When Payee is '0', the corresponding PayeeID_Hash value should not exist (i.e. it should be NA). Please note that I do not want to drop the factor level 0x31BCA02 (it will be present in other rows where Payee has level x). Also, I want to keep the PayeeID_Hash levels as they are (I do not want to replace them with other values).
Expected output:
> x
# A tibble: 3 x 2
  Payee PayeeID_Hash
  <fct> <fct>
1 NA    NA
2 0     NA
3 x     0xB672841
I could do this by converting the factor to character and then back to factor:
library(dplyr)
x %>%
  mutate(PayeeID_Hash = as.character(PayeeID_Hash),
         PayeeID_Hash = ifelse(Payee == "0", NA_character_, PayeeID_Hash),
         PayeeID_Hash = as.factor(PayeeID_Hash))
Is there a cleaner (i.e. more straightforward) way to do this?
We can use replace and avoid steps 2 and 4 (the conversions to character and back to factor). It keeps the column a factor, whereas ifelse coerces a factor to its integer codes unless the factor is first converted to character.
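library(dplyr)
x %>%
  # replace() sets the matching rows to NA and keeps the column a factor.
  # droplevels() then removes the now-unused level 0x31BCA02; omit it if
  # that level has to be retained.
  mutate(PayeeID_Hash = droplevels(replace(PayeeID_Hash, Payee == "0", NA)))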
# A tibble: 3 x 2
#   Payee PayeeID_Hash
#   <fct> <fct>
# 1 <NA>  <NA>
# 2 0     <NA>
# 3 x     0xB672841
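
As a quick check of the points above, here is a minimal base R sketch. The vectors payee and hash are made-up stand-ins for the two columns, not part of the original data frame, and which() is used only to sidestep the NA produced by comparing the missing Payee with "0". Under those assumptions it suggests that replace leaves the factor levels untouched, that droplevels is what removes the now-unused level, and that ifelse applied directly to a factor returns integer codes.

```r
# Base R only; toy vectors mirroring the two columns in the question
# (hypothetical names, chosen for illustration).
payee <- factor(c(NA, "0", "x"), levels = c("0", "x"))
hash  <- factor(c(NA, "0x31BCA02", "0xB672841"),
                levels = c("0x31BCA02", "0xB672841"))

# replace() assigns NA at the selected positions and leaves the levels alone.
kept <- replace(hash, which(payee == "0"), NA)
levels(kept)      # both original levels are still present

# droplevels() removes levels that no longer occur in the data.
dropped <- droplevels(kept)
levels(dropped)   # only "0xB672841" remains

# ifelse() on a factor returns the underlying integer codes, not a factor.
codes <- ifelse(payee == "0", NA, hash)
class(codes)      # "integer"
```

If the level has to stay, as the question asks, leaving out the droplevels call in the mutate above should be enough.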