I have a data set of teachers as follows:
df <- data.frame(
teacher = c("A", "A", "A", "A", "B", "B", "C", 'C'),
seg = c("1", '1', "2", "2", "1", "2", "1", "2"),
claim = c(
"beth",
'john',
'john',
'beth',
'summer',
'summer',
"hannah",
"hannah"
)
)
I would ideally like to spread my dataset like this:
Desired output.
Any ideas for how I can use either spread or pivot_wide to achieve this? The issue is that there are two grouping variables here (teacher and segment). Some teachers may have multiple of the same segment, but some teachers don't.
One option would be to create a sequence column grouped by 'teacher', 'seg', and then use pivot_wider
library(dplyr)
library(tidyr)
library(stringr)
df %>%
group_by(teacher, seg) %>%
mutate(segN = c("", "double")[row_number()]) %>%
ungroup %>%
mutate(seg = str_c("seg", seg, segN)) %>%
select(-segN) %>%
pivot_wider(names_from = seg, values_from = claim)
# A tibble: 3 x 5
# teacher seg1 seg1double seg2 seg2double
# <fct> <fct> <fct> <fct> <fct>
#1 A beth john john beth
#2 B summer <NA> summer <NA>
#3 C hannah <NA> hannah <NA>
It can be simplified with rowid
from data.table
library(data.table)
df %>%
mutate(seg = str_c('seg', c('', '_double')[rowid(teacher, seg)], seg)) %>%
pivot_wider(names_from = seg, values_from = claim)
#or use spread
# spread(seg, claim)
# teacher seg1 seg_double1 seg2 seg_double2
#1 A beth john john beth
#2 B summer <NA> summer <NA>
#3 C hannah <NA> hannah <NA>