I'm using R and I'm really at a loss right now. I have data like this:
df <- data.frame(
group = c(2, 2, 2, 1, 1, 0, 0, 1, 1, 0, 1, 0),
grade = c(2, 4, 3, 1, 3, 2, 5, 1, 1, 2, 3, 1)
)
I want to have it like this:
group0 group1 group2
1 1 3 0
2 2 0 1
3 0 2 1
4 0 0 1
5 1 0 0
6 0 0 0
I've been trying for hours using subset, tapply, table, for loops and what not but I can't seem to figure it out. I'd be really happy if someone could help me, I can't help but think I'm missing something really easy and obvious.
How can I produce my target output?
/ Solved, see below. Thanks for finding a fitting title btw, you guys are the best!
You can do something like this with dplyr
and tidyr
:
df %>%
count(group, grade) %>%
mutate(group = paste0('group', group)) %>%
spread(group, n, fill = 0)
# A tibble: 5 x 4
grade group0 group1 group2
* <int> <dbl> <dbl> <dbl>
1 1 1 3 0
2 2 2 0 1
3 3 0 2 1
4 4 0 0 1
5 5 1 0 0
If you don't want the additional 'grade' column, you can do:
df %>%
count(group, grade) %>%
mutate(group = paste0('group', group)) %>%
spread(group, n, fill = 0) %>%
select(-grade)
group0 group1 group2
* <dbl> <dbl> <dbl>
1 1 3 0
2 2 0 1
3 0 2 1
4 0 0 1
5 1 0 0