I have 3 columns:
Employee ID(numerical)
Day of work(a date yyyy-mm-dd when employee had a shift)
is_consecutive_work_day (1 if days of work are consecutive, else 0)
I need a 4th: Consecutive_work_days (a cumulative sum of is_consecutive_work_day, which resets to 1 when is_consecutive_work_day = 0). So this will go to a maximum of 5 for any employee id. Some will have 1,2,3 others 1,2...etc.
What am failing to figure out is how to write the 4th column (consecutive_work_days). Not how to write a consecutive sum per employee id, but specifically how to reset to 1 when is_consecutive_work_day = 0 per employee id.
May I ask for your help regarding this 4th column please? Thanks.
You can use window functions. lag()
lets you access the previous day_of_work
for the same employee, which you can compare to the current day_of_work
: if there is a one day difference, then you can set is_consecutive_work_day
to 1.
select
employee_id,
day_of_work,
case
when day_of_work
= lag(day_of_work) over(partition by employee_id order by day_of_work)
+ interval 1 day
then 1
else 0
end is_consecutive_work_day
from mytable
To compute the cumulative sum, it is a bit more complicated. We can use some gaps-and-island technique to put each record in the group it belongs to: basically, everytime is_consecutive_work_day
of 0
is met, a new group starts; we can then do a window sum()
over each group:
select
employee_id,
day_of_work,
is_consecutive_work_day,
sum(is_consecutive_work_day)
over(partition by employee_id, grp order by day_of_work)
consecutive_work_days
from (
select
t.*,
sum(1 - is_consecutive_work_day) over(partition by employee_id order by day_of_work) grp
from (
select
t.*,
case
when day_of_work
= lag(day_of_work) over(partition by employee_id order by day_of_work)
+ interval 1 day
then 1
else 0
end is_consecutive_work_day
from mytable t
) t
) t