I have a database with patient ID numbers and the treatment each patient received. I would like to have a dummy column for every different INDIVIDUAL treatment (i.e., did the patient receive treatment A, B, C, D?).
The example below is heavily simplified: the real data has over 20 treatments and thousands of patients, and I can't figure out a simple way to do this.
example <- data.frame(id_number = c(0, 1, 2, 3, 4),
                      treatment = c("A", "A+B+C+D", "C+B", "B+A", "C"))
I would like to have something like this:
desired_result <- data.frame(id_number = c(0, 1, 2, 3, 4),
                             treatment = c("A", "A+B+C+D", "C+B", "B+A", "C"),
                             A = c(1, 1, 0, 1, 0),
                             B = c(0, 1, 1, 1, 0),
                             C = c(0, 1, 1, 0, 1),
                             D = c(0, 1, 0, 0, 0))
One tidyverse possibility could be:
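library(dplyr)
library(tidyr)

example %>%
  # as.character() guards against treatment being a factor (the default in R < 4.0)
  mutate(treatment2 = strsplit(as.character(treatment), "+", fixed = TRUE)) %>%
  unnest(treatment2) %>%                  # one row per patient/treatment pair
  spread(treatment2, treatment2) %>%      # one column per treatment, NA where absent
  mutate_at(vars(-id_number, -treatment), ~ (!is.na(.)) * 1)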
  id_number treatment A B C D
1         0         A 1 0 0 0
2         1   A+B+C+D 1 1 1 1
3         2       C+B 0 1 1 0
4         3       B+A 1 1 0 0
5         4         C 0 0 1 0
Or:
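example %>%
  # as.character() guards against treatment being a factor (the default in R < 4.0)
  mutate(treatment2 = strsplit(as.character(treatment), "+", fixed = TRUE)) %>%
  unnest(treatment2) %>%                  # one row per patient/treatment pair
  mutate(val = 1) %>%
  spread(treatment2, val, fill = 0)       # absent treatments are filled with 0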
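With a recent tidyr, the same idea can probably be written with `separate_rows()` and `pivot_wider()` instead of `strsplit()`/`unnest()`/`spread()`. This is only a sketch, assuming tidyr 1.1.0 or later (where `values_fill` accepts a single value); the names `treatment2`, `val` and `result` are just illustrative, and the `distinct()` step is an added assumption to guard against a treatment being listed twice for one patient:

```r
library(dplyr)
library(tidyr)

example <- data.frame(id_number = c(0, 1, 2, 3, 4),
                      treatment = c("A", "A+B+C+D", "C+B", "B+A", "C"),
                      stringsAsFactors = FALSE)

result <- example %>%
  mutate(treatment2 = treatment) %>%            # work on a copy so the original column is kept
  separate_rows(treatment2, sep = "\\+") %>%    # sep is a regex, so "+" must be escaped
  distinct() %>%                                # assumption: drop repeats such as "A+A"
  mutate(val = 1) %>%
  pivot_wider(names_from = treatment2,
              values_from = val,
              values_fill = 0)                  # treatments a patient did not get become 0

result
```

The new columns appear in the order the treatments are first seen in the data, so with more than 20 treatments you may want to reorder them afterwards, or try `names_sort = TRUE` in `pivot_wider()`.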