sql mysql gaps-and-islands

Matching records between EventA and the first EventB before the next EventA, in a specific order


I have the following data:

id  datec                event
1   2022-09-19 12:16:38  EVENTA
2   2022-09-19 12:16:38  A
3   2022-09-19 12:21:08  B
4   2022-09-19 12:21:12  EVENTD
5   2022-09-19 12:25:18  C
6   2022-09-19 12:25:18  D
7   2022-09-19 12:25:28  E
8   2022-09-19 12:25:29  F
9   2022-09-19 12:25:38  EVENTA
10  2022-09-19 12:25:39  G
11  2022-09-19 12:25:40  H
12  2022-09-19 12:25:48  I
13  2022-09-19 12:27:18  EVENTD
14  2022-09-19 12:29:08  J

I can't figure out how to select the rows that lie between two marker events, in a specific order. Only the rows from an EVENTA up to the first EVENTD that follows it should be returned.

So the result should be the rows with id 1 to 4 and 9 to 13.

I tried the following, but it returns only ids 1, 4, 9 and 13, omitting the rows between them.

SELECT id, datec, event 
FROM table1 
WHERE event BETWEEN 'EVENTA' AND 'EVENTD';

I then tried to use this,

SELECT id, datec, event 
FROM table1 
WHERE (id BETWEEN (SELECT id 
                   FROM table1 
                   WHERE event BETWEEN 'EVENTA' AND 'EVENTD' 
                   LIMIT 1) 
              AND (SELECT id 
                   FROM table1 
                   WHERE event BETWEEN 'EVENTA' AND 'EVENTD' 
                   LIMIT 1,1)) 
   OR (id BETWEEN (SELECT id 
                   FROM table1 
                   WHERE event BETWEEN 'EVENTA' AND 'EVENTD' 
                   LIMIT 2,1) 
              AND (SELECT id 
                   FROM table1 
                   WHERE event BETWEEN 'EVENTA' AND 'EVENTD' LIMIT 3,1));

It gives me the expected result, but my table has many rows, so I can't hard-code every pair like this.

Can someone please guide me on how to repeat this until the end of the table? I'm sure there is a way to do it, but I can't figure out how.

Regards,

pierre


Solution

  • Here's one approach:

    • compute running counts of armed events (EVENTA) and disarmed events (EVENTD), ordering by date, with id as a tie-breaker
    • within each armed-event count, compute a dense rank of the records, ordered by the running count of disarmed events, and subtract 1 so that it starts at 0

    At this point, note that this ranking value is 0 for the rows of an armed_events partition that come before its first EVENTD. It becomes 1 at the first EVENTD and stays 1 until the next EVENTD.

    So you can filter in the WHERE clause, keeping the rows whose ranking value is 0, plus the rows whose ranking value is 1 and whose event is exactly 'EVENTD'.
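
To make the two running counts and the ranking value concrete, here is a minimal Python sketch that replays the same logic on the sample rows from the question. It only illustrates the idea and is not part of the SQL solution; the names `counted`, `trace` and `kept` are made up for this example.

```python
from collections import defaultdict

# Sample rows from the question: (id, datec, event)
rows = [
    (1, "2022-09-19 12:16:38", "EVENTA"),
    (2, "2022-09-19 12:16:38", "A"),
    (3, "2022-09-19 12:21:08", "B"),
    (4, "2022-09-19 12:21:12", "EVENTD"),
    (5, "2022-09-19 12:25:18", "C"),
    (6, "2022-09-19 12:25:18", "D"),
    (7, "2022-09-19 12:25:28", "E"),
    (8, "2022-09-19 12:25:29", "F"),
    (9, "2022-09-19 12:25:38", "EVENTA"),
    (10, "2022-09-19 12:25:39", "G"),
    (11, "2022-09-19 12:25:40", "H"),
    (12, "2022-09-19 12:25:48", "I"),
    (13, "2022-09-19 12:27:18", "EVENTD"),
    (14, "2022-09-19 12:29:08", "J"),
]

# Step 1: running counts of EVENTA and EVENTD, ordered by (datec, id)
armed = disarmed = 0
counted = []
for id_, datec, event in sorted(rows, key=lambda r: (r[1], r[0])):
    armed += event == "EVENTA"
    disarmed += event == "EVENTD"
    counted.append((id_, event, armed, disarmed))

# Step 2: dense rank of the disarmed count inside each armed group, starting at 0
seen = defaultdict(set)
for _, _, a, d in counted:
    seen[a].add(d)
rank = {a: {d: i for i, d in enumerate(sorted(ds))} for a, ds in seen.items()}
trace = [(id_, event, a, d, rank[a][d]) for id_, event, a, d in counted]

# Step 3: keep rank 0, plus rank 1 only when the row is the EVENTD itself
kept = [id_ for id_, event, a, d, rn in trace
        if rn == 0 or (rn == 1 and event == "EVENTD")]

for row in trace:
    print(row)   # (id, event, armed_events, disarmed_events, rn)
print(kept)      # [1, 2, 3, 4, 9, 10, 11, 12, 13]
```

Rows 5 to 8 and row 14 end up with a ranking value of 1 without being EVENTD, which is why the filter drops them.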

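    -- Requires MySQL 8.0+ (CTEs and window functions)
    WITH cte AS (
        -- Running counts: each EVENTA opens a new group, each EVENTD bumps the disarmed counter
        SELECT *, SUM(`event`='EVENTA') OVER(ORDER BY datec, id) AS armed_events,
                  SUM(`event`='EVENTD') OVER(ORDER BY datec, id) AS disarmed_events
        FROM table1
    ), cte2 AS (
        -- Within each EVENTA group: 0 before the first EVENTD, 1 from that EVENTD up to the next one
        SELECT *, DENSE_RANK() OVER(PARTITION BY armed_events ORDER BY disarmed_events) -1 AS rn
        FROM cte
    )
    SELECT `id`, `datec`, `event`
    FROM cte2
    -- Keep the rows before the first EVENTD, plus that EVENTD itself
    WHERE rn = 0 OR (rn = 1 AND `event` = 'EVENTD')
    ORDER BY id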
    

    Output:

    id  datec                event
    1   2022-09-19 12:16:38  EVENTA
    2   2022-09-19 12:16:38  A
    3   2022-09-19 12:21:08  B
    4   2022-09-19 12:21:12  EVENTD
    9   2022-09-19 12:25:38  EVENTA
    10  2022-09-19 12:25:39  G
    11  2022-09-19 12:25:40  H
    12  2022-09-19 12:25:48  I
    13  2022-09-19 12:27:18  EVENTD

    Note: The last ORDER BY clause is not necessary. It's there just for visualization purposes.
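
If you want to check the query without a MySQL server, the sketch below runs it against an in-memory SQLite database from Python. This is a stand-in and rests on assumptions: it needs the SQLite bundled with Python to be 3.25 or newer (for window functions), the backticks are dropped, and the helper `selected_ids` and the extra row with id 0 are hypothetical additions for this example. It also probes one edge case the sample data does not contain: rows that come before the very first EVENTA land in the group with armed_events = 0 and a ranking value of 0, so the filter as written would keep them. Adding `armed_events > 0` to the WHERE clause is one possible guard if your data can start that way.

```python
import sqlite3

# Sample rows from the question: (id, datec, event)
SAMPLE = [
    (1, "2022-09-19 12:16:38", "EVENTA"), (2, "2022-09-19 12:16:38", "A"),
    (3, "2022-09-19 12:21:08", "B"),      (4, "2022-09-19 12:21:12", "EVENTD"),
    (5, "2022-09-19 12:25:18", "C"),      (6, "2022-09-19 12:25:18", "D"),
    (7, "2022-09-19 12:25:28", "E"),      (8, "2022-09-19 12:25:29", "F"),
    (9, "2022-09-19 12:25:38", "EVENTA"), (10, "2022-09-19 12:25:39", "G"),
    (11, "2022-09-19 12:25:40", "H"),     (12, "2022-09-19 12:25:48", "I"),
    (13, "2022-09-19 12:27:18", "EVENTD"), (14, "2022-09-19 12:29:08", "J"),
]

# Same query as in the answer; {guard} is either empty or the optional extra condition
QUERY = """
WITH cte AS (
    SELECT *, SUM(event = 'EVENTA') OVER (ORDER BY datec, id) AS armed_events,
              SUM(event = 'EVENTD') OVER (ORDER BY datec, id) AS disarmed_events
    FROM table1
), cte2 AS (
    SELECT *, DENSE_RANK() OVER (PARTITION BY armed_events
                                 ORDER BY disarmed_events) - 1 AS rn
    FROM cte
)
SELECT id
FROM cte2
WHERE {guard}(rn = 0 OR (rn = 1 AND event = 'EVENTD'))
ORDER BY id
"""


def selected_ids(rows, guard=False):
    """Load rows into a throwaway in-memory table and return the ids the query keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, datec TEXT, event TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
    sql = QUERY.format(guard="armed_events > 0 AND " if guard else "")
    ids = [row[0] for row in conn.execute(sql)]
    conn.close()
    return ids


# Hypothetical row that occurs before the first EVENTA (not in the question's data)
leading = [(0, "2022-09-19 12:10:00", "Z")]

print(selected_ids(SAMPLE))                       # same ids as the output above
print(selected_ids(leading + SAMPLE))             # id 0 slips through
print(selected_ids(leading + SAMPLE, guard=True)) # id 0 is filtered out
```

On the question's data the guard changes nothing, because the first row is already an EVENTA.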