Transform a table based on date overlap

Problem statement

Hi there, I'm a beginner SQL user. I have access to a database for research, and one table of interest is structured as follows:

`id`	`date`	`use`	`item`
1	2023-02-01	90	A
1	2023-03-01	10	B
1	2023-03-15	15	C
2	2023-02-05	10	B
2	2023-02-13	30	A

This table records the date on which each personnel id acquired an item, and `use` is the number of days the item is in use. I want to transform the table based on how these date ranges overlap for each personnel id, but I'm not sure how to achieve that.

Desired output

I imagine the resulting table would be as follows:

`id`	`start`	`end`	`item`
1	1	29	A
1	30	39	A, B
1	40	43	A
1	44	58	A, C
1	59	90	A
2	1	8	B
2	9	10	A, B
2	11	38	A

Rough idea

My idea here is to:

Convert each date into a relative day index per personnel id, counting the earliest date of that id as day 1
Combine that index with use to obtain the relative last day of use as start + use - 1

`id`	`start`	`use`	`end`	`item`
1	1	90	90	A
1	30	10	39	B
1	44	15	58	C
2	1	10	10	B
2	9	30	38	A

Visually, the item acquisition journey for each personnel would be:

Personnel 1:

                                              +--------------C (15)  
                               +---------B (10)  
+------------------------------------------------------------------------------------------A (90)  
+------------------------------+--------------+-------------------------------------------->  
1                              30             44                                         90

Personnel 2:

        +--------------------------------A (30)  
+---------B (10)  
+-------+-------------------------------->  
1       9                              38

Finally, I can evaluate concurrent item acquisition as follows:

`id`	`start`	`end`	`item`
1	1	29	A
1	30	39	A, B
1	40	43	A
1	44	58	A, C
1	59	90	A
2	1	8	B
2	9	10	A, B
2	11	38	A
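
As a way to cross-check the expected table outside the database, here is a minimal Python sketch (not SQL, and not part of my attempt below). It assumes the relative start and use values from the intermediate table above, and it splits each id's timeline at every start and end boundary. The function name `segments` is made up for illustration.

```python
from collections import defaultdict

# Relative intervals copied from the intermediate table: (id, start, use, item).
rows = [
    (1, 1, 90, "A"),
    (1, 30, 10, "B"),
    (1, 44, 15, "C"),
    (2, 1, 10, "B"),
    (2, 9, 30, "A"),
]


def segments(rows):
    """Split each id's timeline at every interval boundary and list the active items."""
    by_id = defaultdict(list)
    for pid, start, use, item in rows:
        # Half-open interval [start, stop): the last day in use is start + use - 1.
        by_id[pid].append((start, start + use, item))

    result = []
    for pid in sorted(by_id):
        intervals = by_id[pid]
        # Every start or stop day is a point where the set of active items may change.
        cuts = sorted({day for start, stop, _ in intervals for day in (start, stop)})
        for lo, hi in zip(cuts, cuts[1:]):
            active = sorted(item for start, stop, item in intervals if start <= lo < stop)
            if active:  # skip stretches where nothing is in use
                result.append((pid, lo, hi - 1, ", ".join(active)))
    return result


for row in segments(rows):
    print(row)
```

With the sample rows this prints the eight rows of the expected table. The sketch does not merge two neighbouring segments that happen to hold the same items, which does not occur in this sample.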

What I found

I found some relevant topics to this question, including:

What I tried

-- Create a dummy table
CREATE TABLE tbl (
  id int,
  start_date date,
  day_of_use int,
  item varchar(8)
)
;

-- Populating table with sample data
INSERT INTO tbl VALUES (1, '2023-02-01', 90, 'A');
INSERT INTO tbl VALUES (1, '2023-03-02', 10, 'B');
INSERT INTO tbl VALUES (1, '2023-03-15', 15, 'C');
INSERT INTO tbl VALUES (2, '2023-02-05', 10, 'B');
INSERT INTO tbl VALUES (2, '2023-02-13', 30, 'A');

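-- Querying the relative start and end day of each acquisition
SELECT
  a.id,
  DATEDIFF(a.start_date, b.init) + 1 AS start,            -- day 1 = earliest date of this id
  a.day_of_use,
  DATEDIFF(a.start_date, b.init) + a.day_of_use AS end,   -- equals start + day_of_use - 1
  a.item
FROM
  tbl AS a
  LEFT JOIN
  (
    -- Earliest acquisition date per id
    SELECT id, MIN(start_date) AS init
    FROM tbl
    GROUP BY id
  ) AS b ON a.id = b.id
;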

However, I'm not sure how to proceed from here. Any ideas would be appreciated, thanks in advance. SQL Fiddle: https://www.db-fiddle.com/f/eB1GnZTuuH5P71kr4pWqHu/2

Solution

If you're using MySQL 8.0 or later, you could try:

unrolling your intervals into one row per day with a recursive CTE
treating the result as a gaps-and-islands problem
aggregating to gather the items of each island

To address the gaps-and-islands part, you can rebuild the partitions with a running count of how many times the set of items changes between consecutive days.

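WITH RECURSIVE cte AS (
    -- Anchor: one row per acquisition, on its first day of use
    SELECT id, start_date AS date_, day_of_use, item
    FROM tbl

    UNION ALL

    -- Step: one more row per remaining day; stopping at > 1 yields exactly day_of_use rows
    SELECT id, DATE_ADD(date_, INTERVAL 1 DAY), day_of_use - 1, item
    FROM cte
    WHERE day_of_use > 1
), cte2 AS (
    -- Number the distinct dates of each id; equals the day offset as long as no day is left uncovered
    SELECT id,
           DENSE_RANK() OVER(PARTITION BY id ORDER BY date_) AS rn,
           item
    FROM cte
), cte3 AS (
    -- Items in use on each day, next to the previous day's items
    SELECT id,
           rn,
           GROUP_CONCAT(item ORDER BY item) AS items,
           LAG(GROUP_CONCAT(item ORDER BY item)) OVER(PARTITION BY id ORDER BY rn) AS prev_items
    FROM cte2
    GROUP BY id, rn
), cte4 AS (
    -- Running count of changes: constant within each run of identical item sets
    SELECT id, rn, items,
           COUNT(CASE WHEN prev_items != items THEN 1 END) OVER(PARTITION BY id ORDER BY rn) AS parts
    FROM cte3
)
SELECT id, MIN(rn) AS start_, MAX(rn) AS end_, items
FROM cte4
GROUP BY id, items, parts
ORDER BY id, start_;

Output:

id	start_	end_	items
1	1	29	A
1	30	39	A,B
1	40	42	A
1	43	57	A,C
1	58	90	A
2	1	8	B
2	9	10	A,B
2	11	38	A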

Check the demo here.
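
If the running-count step is hard to picture, here is a small Python sketch of the same idea outside SQL. It is only an illustration: the `daily` list is hypothetical data standing in for the per-day `items` values of `cte3` for a single id, and `islands` is a made-up helper name.

```python
from itertools import accumulate, groupby

# Hypothetical per-day item sets for one id, standing in for the rows of cte3.
daily = ["A", "A", "A,B", "A,B", "A", "A,C", "A,C", "A,C", "A"]


def islands(daily):
    """Return (start_day, end_day, items) for each run of identical consecutive values."""
    # 1 where the value differs from the previous day, 0 otherwise (the first day is 0).
    changed = [0] + [int(cur != prev) for prev, cur in zip(daily, daily[1:])]
    # Running sum of the change flags: constant inside a run, +1 at every change.
    parts = list(accumulate(changed))
    days = range(1, len(daily) + 1)
    out = []
    for _, group in groupby(zip(days, daily, parts), key=lambda t: t[2]):
        group = list(group)
        out.append((group[0][0], group[-1][0], group[0][1]))
    return out


print(islands(daily))
```

The running sum plays the role of `parts` in `cte4`: the two separate runs of "A" get different values, so grouping by items alone would merge them while grouping by items and `parts` keeps them apart.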

Note that February 2023 has only 28 days, so item C (2023-03-15) starts on relative day 43 for id 1, not day 44 as in your expected output.
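
The day offsets can be double-checked with plain date arithmetic. This is a minimal Python sketch, assuming the start dates of id 1 from the INSERT statements in the question; it mirrors `DATEDIFF(start_date, init) + 1`.

```python
from datetime import date

# Start dates for id 1, taken from the INSERT statements in the question.
first = date(2023, 2, 1)
starts = {"A": date(2023, 2, 1), "B": date(2023, 3, 2), "C": date(2023, 3, 15)}

# Same arithmetic as DATEDIFF(start_date, init) + 1 in the query.
offsets = {item: (d - first).days + 1 for item, d in starts.items()}
print(offsets)  # {'A': 1, 'B': 30, 'C': 43}
```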