Search code examples
rdataframedplyrgroup

Assign rows to groups of 3 in data frame?


I have the following data frame:

id
A     
A     
A     
A     
A
B     
B     
B     
B     
B     
B     
C    
C     

I want to create another column called "Group.3". In this column, the number of rows per group are counted in threes. 3 consecutive rows are assigned the same group number, and the next 3 or less rows (depending if we're reaching the end of the group) are assigned another higher number.

This is what I would like the column to look like:

id  Group.3
A     1
A     1
A     1
A     2
A     2
B     1
B     1
B     1
B     2
B     2
B     2
C     1
C     1

Any advice on how I can do this, preferably using the functions group_by and mutate in the package dplyr?


Solution

  • Using group_by and mutate with dplyr, we can accomplish this by finding the row number within each group, and then using modulo to group row numbers in groups of 3:

    library(dplyr)
    df <- data.frame("id"=c(rep("A",5), rep("B", 6), rep("C",2))
    > df %>% group_by(id) %>% mutate(Group.3 = (row_number()-1)%/%3+1)
    # A tibble: 13 × 2
    # Groups:   id [3]
       id    Group.3
       <chr>   <dbl>
     1 A           1
     2 A           1
     3 A           1
     4 A           2
     5 A           2
     6 B           1
     7 B           1
     8 B           1
     9 B           2
    10 B           2
    11 B           2
    12 C           1
    13 C           1
    

    By the way, it can be really helpful to post a reproducible example so that the community can quickly load your data in and begin answering your question.