Search code examples
sqloraclesql-order-bygroup

Oracle SQL Grouping same value records in particular order


I have simple table of Time and Task columns where it is ordered by Time and has associated task to it as per below example. Question: How to group the tasks and get the earliest time of every time the task is changing?

SELECT 
    "Time",
    "Task"
FROM my_query
ORDER BY
    "Time"
Time Task
06:06:41 Repack
06:06:44 Vehicle Load
06:06:45 Vehicle Load
06:06:46 Vehicle Load
06:10:47 Repack
06:10:48 Vehicle Load
06:10:51 Vehicle Load
06:10:52 Vehicle Load
06:15:45 Repack
06:16:13 Repack
06:24:30 Repack
06:24:34 Vehicle Load
06:26:11 Repack
06:26:12 Vehicle Load
06:26:13 Vehicle Load
06:29:33 Relocate
06:46:07 Repack
06:46:45 Repack
07:01:54 Pick
07:04:12 Pick
07:04:26 Marshal
07:06:28 Pick
07:17:06 Repack
07:17:10 Vehicle Load
07:17:11 Vehicle Load
07:21:55 Repack

The output is required:

Time Task
06:06:41 Repack
06:06:44 Vehicle Load
06:10:47 Repack
06:10:48 Vehicle Load
06:15:45 Repack

and so on...

I have tried DENSE_RANK(), ROW_NUMBER() functions without any luck.. Please help


Solution

  • You can use the LAG analytic function to get the previous Task and then find when that changes:

    SELECT Time,
           Task
    FROM   (
      SELECT Time,
             Task,
             LAG(task) OVER (ORDER BY time) AS prev_task
      FROM my_query
    )
    WHERE  task != prev_task
    OR     prev_task IS NULL
    

    Or, from Oracle 12, you can use MATCH_RECOGINZE to do row-by-row pattern matching:

    SELECT time, Task
    FROM   my_query
           MATCH_RECOGNIZE(
             ORDER BY time
             ALL ROWS PER MATCH
             PATTERN (not_same)
             DEFINE not_same AS task != PREV(task) OR PREV(task) IS NULL
           )
    

    Or, if you did want to use ROW_NUMBER then you can use:

    SELECT time,
           Task
    FROM   (
      SELECT time,
             Task,
             ROW_NUMBER() OVER (PARTITION BY task, grp ORDER BY time) AS rn
      FROM   (
        SELECT Time,
               Task,
               ROW_NUMBER() OVER (ORDER BY time)
                 - ROW_NUMBER() OVER (PARTITION BY task ORDER BY time) AS grp
        FROM   my_query
      )
    )
    WHERE rn = 1
    ORDER BY time
    

    (But the first two are probably going to be more efficient.)

    Which, for the sample data:

    CREATE TABLE my_query (Time, Task) AS
    SELECT INTERVAL '06:06:41' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '06:06:44' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:06:45' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:06:46' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:10:47' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '06:10:48' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:10:51' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:10:52' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:15:45' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '06:16:13' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '06:24:30' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '06:24:34' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:26:11' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '06:26:12' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:26:13' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '06:29:33' HOUR TO SECOND, 'Relocate'     FROM DUAL UNION ALL
    SELECT INTERVAL '06:46:07' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '06:46:45' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '07:01:54' HOUR TO SECOND, 'Pick'         FROM DUAL UNION ALL
    SELECT INTERVAL '07:04:12' HOUR TO SECOND, 'Pick'         FROM DUAL UNION ALL
    SELECT INTERVAL '07:04:26' HOUR TO SECOND, 'Marshal'      FROM DUAL UNION ALL
    SELECT INTERVAL '07:06:28' HOUR TO SECOND, 'Pick'         FROM DUAL UNION ALL
    SELECT INTERVAL '07:17:06' HOUR TO SECOND, 'Repack'       FROM DUAL UNION ALL
    SELECT INTERVAL '07:17:10' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '07:17:11' HOUR TO SECOND, 'Vehicle Load' FROM DUAL UNION ALL
    SELECT INTERVAL '07:21:55' HOUR TO SECOND, 'Repack'       FROM DUAL;
    

    All output:

    TIME TASK
    +00 06:06:41.000000 Repack
    +00 06:06:44.000000 Vehicle Load
    +00 06:10:47.000000 Repack
    +00 06:10:48.000000 Vehicle Load
    +00 06:15:45.000000 Repack
    +00 06:24:34.000000 Vehicle Load
    +00 06:26:11.000000 Repack
    +00 06:26:12.000000 Vehicle Load
    +00 06:29:33.000000 Relocate
    +00 06:46:07.000000 Repack
    +00 07:01:54.000000 Pick
    +00 07:04:26.000000 Marshal
    +00 07:06:28.000000 Pick
    +00 07:17:06.000000 Repack
    +00 07:17:10.000000 Vehicle Load
    +00 07:21:55.000000 Repack

    fiddle