I have a dataset with 10 columns. One of those columns is the date. I want to create dummy variables for every month. How do I go about doing this?
Date Col1 Col2
2017-01-09 v 2
2017-05-01 s 7
2018-03-02 k 9
I can extract the month using lubridate:
library(lubridate)
df$MONTH <- month(df$Date)
Date Col1 Col2 MONTH
2017-01-09 v 2 1
2017-05-01 s 7 5
2018-03-02 k 9 3
How do I transform this to have the dummy variables for each month cbinded to the original?
Date Col1 Col2 M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12
2017-01-09 v 2 1 0 0 0 0 0 0 0 0 0 0 0
2017-05-01 s 7 0 0 0 0 1 0 0 0 0 0 0 0
2018-03-02 k 9 0 0 1 0 0 0 0 0 0 0 0 0
One option is to apply tabulate to each value of 'MONTH' and create the columns from the result:
df[paste0("M", 1:12)] <- as.data.frame(t(sapply(df$MONTH, tabulate, 12)))
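To see why the tabulate approach works, here is a minimal sketch on a standalone vector. The names `months`, `one_row`, and `dummies` are made up for illustration and are not part of the original data.

```r
# tabulate() counts how often each integer 1..nbins occurs in its input.
# For a single month value this gives a one-hot vector of length 12.
one_row <- tabulate(5, nbins = 12)
one_row
# [1] 0 0 0 0 1 0 0 0 0 0 0 0

# Hypothetical month vector matching the example rows (Jan, May, Mar)
months <- c(1, 5, 3)

# sapply() returns one column per input value (12 x 3),
# so t() is needed to get one row per observation (3 x 12)
dummies <- t(sapply(months, tabulate, nbins = 12))
dim(dummies)
```

Passing 12 as `nbins` matters: without it, tabulate would stop at the largest value it sees, and the rows would have different lengths.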
Or use row/column indexing: start from a matrix of 0's, take the column index for each row from 'MONTH', and set those positions to 1
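# one row per observation, one column per month, all 0's to start
m1 <- matrix(0, nrow(df), 12)
# a two-column index matrix (row, column) picks one cell per row
m1[cbind(seq_len(nrow(df)), df$MONTH)] <- 1
# add the 12 columns to the original data as M1..M12
df[paste0("M", 1:12)] <- m1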
df
# Date Col1 Col2 MONTH M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12
#1 2017-01-09 v 2 1 1 0 0 0 0 0 0 0 0 0 0 0
#2 2017-05-01 s 7 5 0 0 0 0 1 0 0 0 0 0 0 0
#3 2018-03-02 k 9 3 0 0 1 0 0 0 0 0 0 0 0 0
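As a third possibility, not one of the two options above, the same dummies can be built by comparing 'MONTH' against 1:12 with outer. The sketch below is self-contained: it rebuilds the three example rows and, as an assumption made only to avoid the lubridate dependency, derives the month with base R's format(). The names `dummies` and `out` are illustrative.

```r
# Rebuild the example data from the question
df <- data.frame(
  Date = as.Date(c("2017-01-09", "2017-05-01", "2018-03-02")),
  Col1 = c("v", "s", "k"),
  Col2 = c(2, 7, 9)
)

# Base-R stand-in for lubridate::month(): "%m" gives the two-digit month
df$MONTH <- as.integer(format(df$Date, "%m"))

# Compare each month with 1..12; adding 0L turns TRUE/FALSE into 1/0
dummies <- outer(df$MONTH, 1:12, "==") + 0L
colnames(dummies) <- paste0("M", 1:12)

# Bind the 12 dummy columns to the original data
out <- cbind(df, dummies)
out
```

This always yields all 12 columns, including months that never occur in the data, which is the layout asked for in the question.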