r, dummy-variable, r-mice

How to extend a data frame with dummy variables using dummyVars from caret?


I am following an online tutorial to convert a column (hyp) in the nhanes dataset into dummy variables with the caret package, as follows:

library(mice)
library(caret)
set.seed(123)

imp <- mice(mice::nhanes, m=5)
df = complete(imp, action="long")

df$hyp = as.factor(as.character(df$hyp))
dummy <- dummyVars(formula = ~ hyp, data=df)
df <- data.frame(predict(dummy, newdata = df))
df

I only want to dummify the hyp variable and keep all the other variables in the dataset, but after data.frame(), df contains only hyp.1 and hyp.2. My current workaround is to save df to CSV and manually add the remaining columns back, which is quite tedious. Is there a way to dummify hyp while keeping all the non-dummy variables in the data? Thanks.


Solution

  • You can do this easily without the caret package. For example:

    library(dplyr)
    library(mice)
    
    imp <- mice(mice::nhanes, m=5)
    df <- complete(imp, action="long")
    
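    # hyp is coded 1/2 in nhanes, so simple arithmetic yields 0/1 indicators
    df <- df %>%
      mutate(hyp1 = 2 - hyp,    # 1 when hyp == 1, else 0
             hyp2 = hyp - 1) %>% # 1 when hyp == 2, else 0
      select(-hyp)              # drop the original column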
    

    Or, using base R on the same completed data frame (while hyp is still numeric):

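    df$hyp.1 <- 2 - df$hyp   # 1 when hyp == 1, else 0
    df$hyp.2 <- df$hyp - 1   # 1 when hyp == 2, else 0
    # Assign the result back so the original hyp column is actually dropped
    df <- df[, !colnames(df) %in% "hyp"]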
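
If you would rather keep the caret workflow from the question, one possible approach is to bind the dummy columns back onto the remaining columns instead of overwriting df. The following is a minimal sketch, assuming hyp is converted to a factor as in the question; the names hyp_dummies and df_wide are made up for illustration, and printFlag = FALSE is only there to silence the mice iteration log.

```r
library(mice)
library(caret)
set.seed(123)

# Impute and stack the completed datasets, as in the question
imp <- mice(mice::nhanes, m = 5, printFlag = FALSE)
df <- mice::complete(imp, action = "long")

# dummyVars expands factors, so convert hyp first
df$hyp <- as.factor(df$hyp)
dummy <- dummyVars(~ hyp, data = df)

# predict() returns only the dummy columns (hyp.1 and hyp.2)
hyp_dummies <- predict(dummy, newdata = df)

# Keep every column except hyp and append the dummy columns
# (df_wide is a hypothetical name for the combined result)
df_wide <- cbind(df[, setdiff(names(df), "hyp")], hyp_dummies)
head(df_wide)
```

Because the formula passed to dummyVars mentions only hyp, predict() has nothing else to return, which is why the other columns disappeared in the original code. Binding the result to the untouched columns avoids the CSV round trip.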