sql postgresql gaps-and-islands dense-rank

Numbering rows based on multiple fields changes (incluging an "invisible" one) in PostgreSQL

I had a look at the previous topics, but I cannot achieve what I want.

I have a table like this :

id   status     update_date
---  ---        ---
A    PENDING    2020-11-01
A    PENDING    2020-11-02
A    CONFIRMED  2020-11-03
A    CONFIRMED  2020-11-04
A    CONFIRMED  2020-11-05
A    PENDING    2020-11-06
A    PAID       2020-11-07
B    CONFIRMED  2020-11-02
etc.

and I want to have this :

id   status     rank
---  ---        ---
A    PENDING    1
A    CONFIRMED  2
A    PENDING    3
A    PAID       4
B    CONFIRMED  1
etc.

meaning taking into account the update_date (and of course the status change) to sort and number the rows, but NOT having the order date in the final result

PS: as you can see, I can go back and forth from one status to the other ( PENDING -> CONFIRMED -> PENDING -> etc.) multiple times

Thanks lot !

Solution

You can address this as a gaps-and-island problem. The difference between row numbers gives you the group each record belongs to, that you can then use to aggregate:

select id, status, 
    row_number() over(partition by id order by min(update_date)) as rn
from (
    select t.*,
        row_number() over(partition by id order by update_date) rn1,
        row_number() over(partition by id, status order by update_date) rn2
    from mytable t
) t
group by id, status, rn1 - rn2
order by id, min(update_date)

Demo on DB Fiddle:

id | status    | rn
:- | :-------- | -:
A  | PENDING   |  1
A  | CONFIRMED |  2
A  | PENDING   |  3
A  | PAID      |  4
B  | CONFIRMED |  1