I want to be able to efficiently recode the factor levels of a large number of variables (columns) of a data frame by replacing one of the levels with the name of the variable (column).
Health <- tibble(Anemia = c("yes", "no", "no"),
BloodPressure = c("no", "yes", "no"),
Asthma = c("no", "no", "yes"))
I want the output to look like this:
Health2 <- tibble(Anemia = c("Anemia", "no", "no"),
BloodPressure = c("no", "BloodPressure", "no"),
Asthma = c("no", "no", "Asthma"))
I want this output without changing each level by hand, because I have a database with 100 or so variables that I have to recode. I tried writing a function to do this:
Med_rename <- function(x) {
levels = c(no = "no", names(x) ="yes")
fct_recode(x, !!!levels)
}
Med_rename2 <- function(x) {
y = names(x)
levels = c(no = "no", y ="yes")
fct_recode(x, !!!levels)
}
but neither of these attempts, nor others using vectorized approaches, replaces "yes" with the variable (column) name. Is there another vectorized way to replace "yes" with the column name and apply it to a large set of variables?
You can use cur_column() in dplyr to replace "yes" with the name of the current column.
library(dplyr)
Health %>% mutate(across(everything(), ~replace(., . == 'yes', cur_column())))
# Anemia BloodPressure Asthma
# <chr> <chr> <chr>
#1 Anemia no no
#2 no BloodPressure no
#3 no no Asthma
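The example data holds character columns, while the question talks about factor levels. If the real columns are factors, replace() with a value that is not an existing level would produce NA with a warning, so renaming the level itself is safer. Here is a minimal sketch of that variant, assuming the columns are factors; Health_fct and yes_to_name() are hypothetical names introduced for illustration only.

```r
library(dplyr)

# Hypothetical factor version of the example data (not from the original post)
Health_fct <- tibble(
  Anemia        = factor(c("yes", "no", "no")),
  BloodPressure = factor(c("no", "yes", "no")),
  Asthma        = factor(c("no", "no", "yes"))
)

# Hypothetical helper: rename the "yes" level to the supplied column name.
# If a column has no "yes" level, it is returned unchanged.
yes_to_name <- function(x, col) {
  levels(x)[levels(x) == "yes"] <- col
  x
}

# cur_column() supplies the name of the column currently being processed
Health_fct2 <- Health_fct %>%
  mutate(across(everything(), ~ yes_to_name(.x, cur_column())))
```

The helper takes the column name as an ordinary argument because a bare vector passed to a function does not carry its column name, which is why names(x) in the attempts above returns NULL.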
In base R, with lapply:
Health[] <- lapply(names(Health), function(x)
replace(Health[[x]], Health[[x]] == 'yes', x))
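If the real data also contains columns that should not be recoded, the same base R idea can be limited to a chosen set of columns. The following is only a sketch under that assumption: Health_df, the id column, and the cols vector are hypothetical and not part of the original data.

```r
# Hypothetical data frame with an extra column that must stay untouched
Health_df <- data.frame(
  id            = 1:3,
  Anemia        = c("yes", "no", "no"),
  BloodPressure = c("no", "yes", "no"),
  Asthma        = c("no", "no", "yes"),
  stringsAsFactors = FALSE
)

# Hypothetical vector naming the columns to recode
cols <- c("Anemia", "BloodPressure", "Asthma")

# Map() walks over each selected column together with its name
Health_df[cols] <- Map(
  function(col, nm) replace(col, col == "yes", nm),
  Health_df[cols],
  cols
)
```

With around 100 variables, cols could be built programmatically, for example from a shared name prefix or a range of column positions, rather than typed by hand.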