I have a table named MyDates with columns Id (INT) and MyDate (DATE). MyDates can contain several rows with the same date.
I want to count the number of consecutive workdays, starting from today and going backwards in time, where a workday is defined as any weekday from Monday to Friday.
I have tried to write a query using cte (common table expression), but I am stuck
Lets say that today is 27. April 2023 The MyDates table contains the following seven rows:
1, 2023-04-17
2, 2023-04-21
3, 2023-04-24
4, 2023-04-24
5, 2023-04-25
6, 2023-04-26
7, 2023-04-27
I want the query to return Streak = 5
That is because the streak should count the four dates from Thursday 2023-14-27 to Monday 2023-04-24, plus Friday 2023-04-21.
The query should ignore Saturday 2023-04-22 and Sunday 2023-04-23, and it should ignore the fact that there are to rows for Monday 2023-04-24
SELECT MIN(date) AS start_date, MAX(date) AS end_date, COUNT(*) AS streak_length
FROM (
SELECT date, value, ROW_NUMBER() OVER (ORDER BY date) AS rn,
ROW_NUMBER() OVER (PARTITION BY value >= threshold, is_weekend ORDER BY date) AS grp
FROM (
SELECT date, value,
CASE WHEN DAYOFWEEK(date) IN (1, 7) THEN 1 ELSE 0 END AS is_weekend
FROM table_name
) sub
) sub
WHERE value >= threshold
GROUP BY value, grp - rn
ORDER BY start_date;
Here is a step-by-step approach at this gaps-and-islands problem.
We would start by deduplicating the dates, and associating the "previous" date with lag()
; just in case, we can also exclude dates that fall on week-ends, if any:
select date, lag(date) over(order by date) lag_date
from mytable
where datepart(weekday, date) not in (1, 7)
group by date
From there on, we can use a conditional sum to define groups of "adjacent" records (aka islands) ; the trick is to adjust the date arithmetic when we meet a week-end:
select t.*,
sum(
case when date = dateadd(
day,
case when datepart(weekday, lag_date) = 6 then 3 else 1 end,
lag_date
) then 0 else 1 end) over(order by date desc) grp
from (
select date, lag(date) over(order by date) lag_date
from mytable
group by date
) t
The final step is to retain the first group only (that corresponds to the most recent streak):
select min(lag_date) streak_start, max(date) streak_end,
count(*) + 1 streak_length
from (
select t.*,
sum(
case when date = dateadd(
day,
case when datepart(weekday, lag_date) = 6 then 3 else 1 end,
lag_date
) then 0 else 1 end) over(order by date desc) grp
from (
select date, lag(date) over(order by date) lag_date
from mytable
where datepart(weekday, date) not in (1, 7)
group by date
) t
) t
where grp = 0
streak_start | streak_end | streak_length |
---|---|---|
2023-04-21 | 2023-04-27 | 5 |
Important note: dayofweek
, that is used in your original query, is not a thing in SQL Server. We use datepart(weekday, ...)
instead. However please note that the result of this function depend on how you set the datefirst
parameter. The answer here assumes set datefirst 7
(meaning that dates weeks start on Sunday, as implied by your original code).