collapse consecutive periods if the same

I tried to express my use case in R code as follows:

haves <- data.frame(
          id = c(1,1,1,1)
        , start = c(as.Date('2022-01-01'), as.Date('2022-02-01'), as.Date('2022-02-01'), as.Date('2022-02-16'))
        , end = c(as.Date('2022-01-31'), as.Date('2022-02-28'), as.Date('2022-02-15'), as.Date('2022-02-28'))
        , category = c("A", "B", "A", "A")
    )

haves

wants <- data.frame(
          id = c(1,1,1)
        , start = c(as.Date('2022-01-01'), as.Date('2022-02-01'), as.Date('2022-02-01'))
        , end = c(as.Date('2022-01-31'), as.Date('2022-02-28'), as.Date('2022-02-28'))
        , category = c("A", "B", "A")
    )

wants

Basically I want to collapse time periods (see last 2 columns in wants) if they are consecutive and contain the same category but also depended on id. Do you think this is possible? A group by id and category and then using min and max on start and end would not work.

Solution

Using rle, this should work:

r <- rle(haves$category)$l
haves %>% 
  mutate(cons = rep(seq(r), r)) %>% 
  group_by(id, category, cons) %>% 
  summarise(start = min(start),
            end = max(end))

# A tibble: 3 x 5
# Groups:   id, category [2]
     id category  cons start      end       
  <dbl> <chr>    <int> <date>     <date>    
1     1 A            1 2022-01-01 2022-01-31
2     1 A            3 2022-02-01 2022-02-28
3     1 B            2 2022-02-01 2022-02-28

Or with cons = data.table::rleid(category).