I am following an online tutorial to convert one column (hyp) of the nhanes dataset into dummy variables with the caret package, as follows:
library(mice)
library(caret)
set.seed(123)
imp <- mice(mice::nhanes, m=5)
df = complete(imp, action="long")
df$hyp = as.factor(as.character(df$hyp))
dummy <- dummyVars(formula = ~ hyp, data=df)
df <- data.frame(predict(dummy, newdata = df))
df
I only want to dummify the hyp variable and keep all the other variables in the dataset, but after the data.frame() call, df contains only hyp.1 and hyp.2. My current workaround is to save df to a CSV file and manually add the rest of the columns back, which is quite tedious. Is there a way to dummify hyp while keeping all the non-dummy variables in the data? Thanks.
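You can do this easily without the caret package. In nhanes, hyp is numeric and coded as 1 or 2, so 2 - hyp is 1 exactly when hyp is 1, and hyp - 1 is 1 exactly when hyp is 2. For example: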
library(dplyr)
library(mice)
imp <- mice(mice::nhanes, m=5)
df <- complete(imp, action="long")
df <- df %>%
  mutate(hyp1 = 2 - hyp,
         hyp2 = hyp - 1) %>%
  select(-hyp)
Or, using base R:
df$hyp.1 <- 2 - df$hyp
df$hyp.2 <- df$hyp - 1
# Assign the result back, otherwise hyp is only dropped from the printed output
df <- df[, !colnames(df) %in% "hyp"]
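If hyp has already been converted to a factor, as in the question, the arithmetic above no longer applies. One possible alternative is base R's model.matrix(), binding its output back onto the remaining columns. This is only a minimal sketch: the small data frame below is a made-up stand-in for the completed nhanes data (so it runs without mice), and the resulting column names hyp1 and hyp2 come from model.matrix(), not from caret.

```r
# Toy stand-in for the completed nhanes data (values are invented for illustration)
df <- data.frame(
  age = c(1, 2, 3, 1),
  bmi = c(22.5, 27.1, 30.2, 24.9),
  hyp = c(1, 2, 1, 2),
  chl = c(187, 204, 199, 229)
)

# Treat hyp as a factor so model.matrix() expands it into indicator columns
df$hyp <- factor(df$hyp)

# "- 1" removes the intercept, so each level of hyp gets its own 0/1 column
dummies <- model.matrix(~ hyp - 1, data = df)

# Keep every column except hyp, then append the indicator columns
df <- cbind(df[, setdiff(names(df), "hyp"), drop = FALSE], dummies)
df
```

The same cbind() pattern should also work with the matrix returned by predict() on a dummyVars object, if you would rather stay with caret.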