I have data in the following form, with more than a million rows. I want to add a column that numbers the rows within each group of Item3. The first two columns are irrelevant; I included them only to show that the dataset has other columns. I tried cumsum and group_indices, but neither worked.
Item1 | Item2 | Item3 |
---|---|---|
One | Two | A |
One | Two | A |
One | Two | A |
One | Two | B |
One | Two | B |
One | Two | C |

Desired output:

Item1 | Item2 | Item3 | Identifier |
---|---|---|---|
One | Two | A | 1 |
One | Two | A | 2 |
One | Two | A | 3 |
One | Two | B | 1 |
One | Two | B | 2 |
One | Two | C | 1 |
library(tidyverse)

# Example data matching the question
data <- tibble(
  Item1 = c("One", "One", "One", "One", "One", "One"),
  Item2 = c("Two", "Two", "Two", "Two", "Two", "Two"),
  Item3 = c("A", "A", "A", "B", "B", "C")
)

# Number the rows within each Item3 group (.by requires dplyr >= 1.1.0)
data %>%
  mutate(ID = row_number(), .by = Item3)
Item1 Item2 Item3 ID
<chr> <chr> <chr> <int>
1 One Two A 1
2 One Two A 2
3 One Two A 3
4 One Two B 1
5 One Two B 2
6 One Two C 1
Credit to Chamkrai for the .by = Item3 idea 😃
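
If your dplyr is older than 1.1.0, the .by argument is not available. Here is a minimal sketch of two alternatives that should give the same numbering on the example data: the classic group_by() pipeline, and a base R version using ave(). The names with_id and base_id are just illustrative. I have not benchmarked either on a million rows.

```r
library(dplyr)

data <- tibble(
  Item1 = rep("One", 6),
  Item2 = rep("Two", 6),
  Item3 = c("A", "A", "A", "B", "B", "C")
)

# Alternative 1: explicit grouping, for dplyr versions without `.by`
with_id <- data %>%
  group_by(Item3) %>%
  mutate(ID = row_number()) %>%
  ungroup()

# Alternative 2: base R; ave() applies seq_along() within each Item3 group
base_id <- ave(seq_along(data$Item3), data$Item3, FUN = seq_along)
```

Both approaches number rows per distinct Item3 value, so if the same value shows up again further down the data, the count continues rather than restarting.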