[Query and question edited and fixed thanks to comments from @Gordon Linoff and @shawnt00]
I recently inherited a SQL query that calculates the number of some events in time windows of 30 days from a log database. It uses a CTE (Common Table Expression) to generate the 30 days ranges since '2019-01-01' to now. And then it counts the cases in those 30/60/90 days intervals. I am not sure this is the best method. All I know is that it takes a long time to run and I do not understand 100% how exactly it works. So I am trying to rebuild it in an efficient way (maybe as it is now is the most efficient way, I do not know).
I have several questions:
Note: LogDate original format is like 2019-04-01 18:30:12.000.
DECLARE @dt1 Datetime='2019-01-01'
DECLARE @dt2 Datetime=getDate();
WITH ctedaterange
AS (SELECT [Dates]=@dt1
UNION ALL
SELECT [dates] + 30
FROM ctedaterange
WHERE [dates] + 30<= @dt2)
SELECT
[dates],
lt.Activity, COUNT(*) as Total,
SUM(CASE WHEN lt.LogDate <= dates and lt.LogDate > dates - 90 THEN 1 ELSE 0 END) AS Activity90days,
SUM(CASE WHEN lt.LogDate <= dates and lt.LogDate > dates - 60 THEN 1 ELSE 0 END) AS Activity60days,
SUM(CASE WHEN lt.LogDate <= dates and lt.LogDate > dates - 30 THEN 1 ELSE 0 END) AS Activity30days
FROM ctedaterange AS cte
JOIN (SELECT Activity, CONVERT(DATE, LogDate) as LogDate FROM LogTable) AS lt
ON cte.[dates] = lt.LogDate
group by [dates], lt.Activity
OPTION (maxrecursion 0)
Sample dataset (LogTable):
LogDate, Activity
2020-02-25 01:10:10.000, Activity01
2020-04-14 01:12:10.000, Activity02
2020-08-18 02:03:53.000, Activity02
2019-10-29 12:25:55.000, Activity01
2019-12-24 18:11:11.000, Activity03
2019-04-02 03:33:09.000, Activity01
Expected Output (the output does not reflect the data shown above for I would need too many lines in the sample set to be shown in this post)
As I said above, the bottom line is: I need to aggregate data by number of occurrences in time periods of 30, 60 and 90 days.
Activity, Activity90days, Activity60days, Activity30days
Activity01, 3, 0, 1
Activity02, 1, 10, 2
Activity03, 5, 1, 3
Thank you for any suggestion.
SQL Server doesn't yet have the option to range over values of the window frame of an analytic function. Since you've generated all possible dates though and you've already got the counts by date, it's very easy to look back a specific number of (aggregated) rows to get the right totals. Here is my suggested expression for 90 days:
sum(count(LogDate)) over (
partition by Activity order by [dates]
with rows between 89 preceding and current row
)