Search code examples
oracle-databasesequencepartitionrow-number

ROW_NUMBER over PARTITION BY restart row counter between breaks


I have a list of activities that is currently ordered by user, date and time of activity, and ID. I want to generate numbers for each group set by those same fields. Using the following code, I achieve considerable accuracy. However, there's a problem when the same ID is repeated at a later time and I need the row number count to restart instead of continuing from the previous iteration.

Here's my code:

ROW_NUMBER() OVER (PARTITION BY USER_ID, foc_id ORDER BY USER_ID, to_char(activity_date, 'MM/DD/YYYY HH24:MI:SS'), foc_id) seq_nbr

In the image below, we see that FOC_ID "A240" had activity around 2:20PM. Then FOC_ID "B410" had activity around 3:19PM, lastly the user returned to "A240" for additional activity around 3:20. Because there was activity between the first and second sequence of events of "A240," I need the row number (seq_nbr) to restart instead of continuing from the previous activity.

enter image description here


Solution

  • You can use MATCH_RECOGNIZE:

    SELECT user_id,
           activity_date,
           foc_id,
           ROW_NUMBER() OVER ( PARTITION BY user_id, mno ORDER BY activity_date ) AS seq_num
    FROM   table_name
    MATCH_RECOGNIZE (
      PARTITION BY user_id
      ORDER     BY activity_date
      MEASURES
        MATCH_NUMBER() AS mno
      ALL ROWS PER MATCH
      PATTERN ( same_foc_id* last_row  )
      DEFINE
        same_foc_id AS FIRST( foc_id ) = NEXT( foc_id )
    )
    

    or, multiple ROW_NUMBERs:

    SELECT user_id,
           activity_date,
           foc_id,
           ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id, grp ORDER BY activity_date ) AS seq_num
    FROM   (
      SELECT user_id,
             activity_date,
             foc_id,
             ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY activity_date )
               - ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id ORDER BY activity_date ) AS grp
      FROM   table_name
    )
    ORDER BY user_id, activity_date
    

    Which, for the sample data:

    CREATE TABLE table_name ( user_id, activity_date, foc_id ) AS
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:20:34' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:39' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:44' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:58' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:20:11' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:16' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
    SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:33' HOUR TO SECOND, 'A240' FROM DUAL;
    

    Both output:

    USER_ID | ACTIVITY_DATE       | FOC_ID | SEQ_NUM
    :------ | :------------------ | :----- | ------:
    UVAC3   | 2020-11-04 14:20:34 | A240   |       1
    UVAC3   | 2020-11-04 14:21:23 | A240   |       2
    UVAC3   | 2020-11-04 14:21:23 | A240   |       3
    UVAC3   | 2020-11-04 14:21:23 | A240   |       4
    UVAC3   | 2020-11-04 15:19:39 | B410   |       1
    UVAC3   | 2020-11-04 15:19:44 | B410   |       2
    UVAC3   | 2020-11-04 15:19:58 | B410   |       3
    UVAC3   | 2020-11-04 15:20:11 | B410   |       4
    UVAC3   | 2020-11-04 15:22:16 | A240   |       1
    UVAC3   | 2020-11-04 15:22:33 | A240   |       2
    

    db<>fiddle here