I am trying to to create a rank for each instance of a status occurring, for example
ID | Status | From_date | To_date | rank |
---|---|---|---|---|
1 | Available | 2022-01-01 | 2022-01-02 | 1 |
1 | Available | 2022-01-02 | 2022-01-03 | 1 |
1 | Unavailable | 2022-01-03 | 2022-01-10 | 2 |
1 | Available | 2022-01-10 | 2022-01-20 | 3 |
For each ID
, for each instance of a status
occurring, by from_date
ascending.
I want to do this as i see this as the best way of getting to the final result i want which is
ID | Status | From_date | To_date | rank |
---|---|---|---|---|
1 | Available | 2022-01-01 | 2022-01-03 | 1 |
1 | Unavailable | 2022-01-03 | 2022-01-10 | 2 |
1 | Available | 2022-01-10 | 2022-01-20 | 3 |
I tried dense_rank(partition by id order by status, from_date
but can see now why that wouldnt work. Not sure how to get to this result.
So with this CTE for the data:
with data(ID, Status, From_date, To_date) as (
select * from values
(1, 'Available', '2022-01-01', '2022-01-02'),
(1, 'Available', '2022-01-02', '2022-01-03'),
(1, 'Unavailable', '2022-01-03', '2022-01-10'),
(1, 'Available', '2022-01-10', '2022-01-20')
)
the first result, being rank can be done with CONDITIONAL_CHANGE_EVENT:
select *
,CONDITIONAL_CHANGE_EVENT( Status ) OVER ( PARTITION BY ID ORDER BY From_date ) as rank
from data;
ID | STATUS | FROM_DATE | TO_DATE | RANK |
---|---|---|---|---|
1 | Available | 2022-01-01 | 2022-01-02 | 0 |
1 | Available | 2022-01-02 | 2022-01-03 | 0 |
1 | Unavailable | 2022-01-03 | 2022-01-10 | 1 |
1 | Available | 2022-01-10 | 2022-01-20 | 2 |
and thus the keeps the first of each rank can be achieved with a QUALIFY/ROW_NUMBER, because the CONDITIONAL_CHANGE is a complex operation, needs wrapping in a sub-select, so the answer is not as short as I would like:
select * from (
select *
,CONDITIONAL_CHANGE_EVENT( Status ) OVER ( PARTITION BY ID ORDER BY From_date ) as rank
from data
)
qualify row_number() over(partition by id, rank ORDER BY From_date ) = 1
gives:
ID | STATUS | FROM_DATE | TO_DATE | RANK |
---|---|---|---|---|
1 | Available | 2022-01-01 | 2022-01-02 | 0 |
1 | Unavailable | 2022-01-03 | 2022-01-10 | 1 |
1 | Available | 2022-01-10 | 2022-01-20 | 2 |
Also, the final result minus the ranking can be done with:
select *
from data
qualify nvl(Status <> lag(status) over ( PARTITION BY ID ORDER BY From_date ), true)
ID | STATUS | FROM_DATE | TO_DATE |
---|---|---|---|
1 | Available | 2022-01-01 | 2022-01-02 |
1 | Unavailable | 2022-01-03 | 2022-01-10 |
1 | Available | 2022-01-10 | 2022-01-20 |
and thus a rank can be added at the end
select *
,rank() over ( PARTITION BY ID ORDER BY From_date ) as rank
from (
select *
from data
qualify nvl(Status <> lag(status) over ( PARTITION BY ID ORDER BY From_date ), true)
)
ID | STATUS | FROM_DATE | TO_DATE | RANK |
---|---|---|---|---|
1 | Available | 2022-01-01 | 2022-01-02 | 1 |
1 | Unavailable | 2022-01-03 | 2022-01-10 | 2 |
1 | Available | 2022-01-10 | 2022-01-20 | 3 |