Search code examples
sqlwindow-functionsgaps-and-islands

Get maximum of sequence


+----+-------+
| id | value | 
+----+-------+
|  1 |    A  |
|  2 |    B  |
|  3 |    C  |
|  4 |    D  |
|  5 |    D  |
|  6 |    D  |
|  7 |    N  |
|  8 |    P  |
|  9 |    P  |
+----+-------+

Desired output

+----+-------+---------------------+
| id | value |      calc ↓         |
+----+-------+---------------------+
|  1 |    A  |          1          |
|  2 |    B  |          2          |
|  3 |    C  |          3          |
|  4 |    D  |          6          |
|  5 |    D  |          6          |
|  6 |    D  |          6          |
|  7 |    N  |          7          |
|  8 |    P  |          9          |
|  9 |    P  |          9          |
| 10 |    D  |          11         |
| 11 |    D  |          11         |
| 12 |    Z  |          12         |
+----+-------+---------------------+

Can you help me for a solution for this ? Id is identity, id must be present in output, must have the same 9 rows in output.

New note: I added rows 10,11,12. Notice that id 10 and 11 which has letter 'D' is in a different group from id 4,5,6

thanks


Solution

  • If the grouping also depends on the surrounding ids then this turns into something like the gaps and islands problem https://www.red-gate.com/simple-talk/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/#:~:text=The%20SQL%20of%20Gaps%20and%20Islands%20in%20Sequences,...%204%20Performance%20Comparison%20of%20Gaps%20Solutions.%20

    You could use the Tabibitosan method https://rwijk.blogspot.com/2014/01/tabibitosan.html

    Here you also need to group by your value column but that doesn't complicate it too much:

    select id, value, max(id) over (partition by value, island) calc
    from (
    select id, value, id - row_number() over(partition by value order by id) island
    from my_table
    ) as sq
    order by id;
    

    The id - row_number() over(partition by value order by id) expression gives you a number which changes each time the ID value changes by more than 1 for each value of value. This gets included in the max(id) over (partition by value, island) expression. The island number is only valid for that particular value. In your case, both values N and D have a computed island number of 6 but they need to be considered differently.

    Db-fiddle https://www.db-fiddle.com/f/jahP7T6xBt3cpbLRhZZdQG/1