My table of interview candidates has three columns and looks like this (attempt is what I want to calculate):
candidate_id | interview_stage | stage_reached_at | attempt <- want to calculate |
---|---|---|---|
1 | 1 | 2019-01-01 | 1 |
1 | 2 | 2019-01-02 | 1 |
1 | 3 | 2019-01-03 | 1 |
1 | 1 | 2019-11-01 | 2 |
1 | 2 | 2019-11-02 | 2 |
1 | 1 | 2021-01-01 | 3 |
1 | 2 | 2021-01-02 | 3 |
1 | 3 | 2021-01-03 | 3 |
1 | 4 | 2021-01-04 | 3 |
The table represents candidate_id 1 who has had 3 separate interview attempts at a company.
Question: Can I derive this from the sequence of interview_stage values when the rows are ordered by stage_reached_at? As soon as the stage for a particular candidate_id is lower than in the row before, I know it's a new process.
In the end, I want to be able to group on candidate_id and attempt.
Thx in advance.
You can use lag() and then a cumulative sum:
select t.*,
       -- 1 + so that numbering starts at 1: lag() is NULL on a candidate's first row, so that row adds 0
       1 + sum(case when prev_interview_stage >= interview_stage then 1 else 0 end)
               over (partition by candidate_id order by stage_reached_at) as attempt
from (select t.*,
             lag(interview_stage) over (partition by candidate_id order by stage_reached_at) as prev_interview_stage
      from t
     ) t;
Note: Your question specifically says "lower". The query uses >=, so a repeated stage also starts a new attempt, which amounts to "lower or equal to". If you really mean strictly lower, change the >= to >.
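To check the approach against the sample data, here is a minimal sketch that loads the question's rows into an in-memory SQLite database and then groups on candidate_id and attempt. The table name t comes from the query above; the function name summarize_attempts and the output columns started_at and furthest_stage are made up for illustration. It assumes a SQLite build with window-function support (3.25 or later); other databases may need small syntax changes.

```python
import sqlite3

# Sample rows from the question: (candidate_id, interview_stage, stage_reached_at)
ROWS = [
    (1, 1, "2019-01-01"), (1, 2, "2019-01-02"), (1, 3, "2019-01-03"),
    (1, 1, "2019-11-01"), (1, 2, "2019-11-02"),
    (1, 1, "2021-01-01"), (1, 2, "2021-01-02"), (1, 3, "2021-01-03"),
    (1, 4, "2021-01-04"),
]

# lag() + cumulative sum to number the attempts, then one summary row per attempt.
QUERY = """
select candidate_id,
       attempt,
       min(stage_reached_at) as started_at,      -- illustrative column
       max(interview_stage)  as furthest_stage   -- illustrative column
from (select t.*,
             1 + sum(case when prev_interview_stage >= interview_stage then 1 else 0 end)
                     over (partition by candidate_id order by stage_reached_at) as attempt
      from (select t.*,
                   lag(interview_stage)
                       over (partition by candidate_id order by stage_reached_at) as prev_interview_stage
            from t
           ) t
     ) t
group by candidate_id, attempt
order by candidate_id, attempt
"""


def summarize_attempts(rows):
    """Return one (candidate_id, attempt, started_at, furthest_stage) tuple per attempt."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t (candidate_id integer, interview_stage integer, stage_reached_at text)"
        )
        conn.executemany("insert into t values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in summarize_attempts(ROWS):
        print(row)
    # (1, 1, '2019-01-01', 3)
    # (1, 2, '2019-11-01', 2)
    # (1, 3, '2021-01-01', 4)
```

The dates are stored as ISO-8601 text, so ordering by stage_reached_at sorts them chronologically. If two rows for the same candidate could share a timestamp, the order by would need a tie-breaker.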