I have the following dataframe :
D <- data.frame("Id" = c("a","b","c","d","e","f","g"), "Group" = c("1","1","1","2","2","2","2"),"Time" = c("1","1","2","1","2","3","3"))
Id Group Time
1 a 1 1
2 b 1 1
3 c 1 2
4 d 2 1
5 e 2 2
6 f 2 3
7 g 2 3
I want to count the number of individuals by period and time keeping a cylinder structure. The classical way to do it uses dplyr
D %>% group_by(Group,Time) %>% tally()
Group Time n
<fct> <fct> <int>
1 1 1 2
2 1 2 1
3 2 1 1
4 2 2 1
5 2 3 2
but structure is not balanced : time 3 doesn't show up for group 1 here but I'd like to see it associated with a 0, like this:
Group Time n
<fct> <fct> <int>
1 1 1 2
2 1 2 1
3 1 3 0
4 2 1 1
5 2 2 1
6 2 3 2
Is-there a way to balance the results after a group_by ? Has someone faced something similar ? Thanks in advance
Since Time
is factor variable we can use count
with .drop = FALSE
as by default count
drops observation with 0 counts.
library(dplyr)
D %>% count(Group, Time, .drop = FALSE)
# Group Time n
# <fct> <fct> <int>
#1 1 1 2
#2 1 2 1
#3 1 3 0
#4 2 1 1
#5 2 2 1
#6 2 3 2
We can use the same method using tally
as well.
D %>% group_by(Group,Time, .drop = FALSE) %>% tally()
Or with complete
D %>% count(Group, Time) %>% tidyr::complete(Group, Time, fill = list(n = 0))