Search code examples
sqlsql-serverperformancedatelag

Custom month numbers that take last 30 days instead of Number of month (SQL Server)


I am trying to create a lag function to return current month and last month streams for an artist.

Instead of returning streams for Feb vs Jan, I wan the function to use the last 30 days as a period for current month, and the previous 30 days as the previous month.

The query that I am currently using is this:

SELECT
    DATEPART(month, date) AS month,
    artist,

    SUM([Streams]) AS streams,

    LAG(SUM([Streams])) OVER (PARTITION BY artist ORDER BY DATEPART(month, date)) AS previous_month_streams
FROM combined_artist
WHERE date > DATEADD(m, -2, DATEADD(DAY, 2 - DATEPART(WEEKDAY, GETDATE()-7), CAST(GETDATE()-7 AS DATE)))
GROUP BY DATEPART(month, date), artist;

While this works, it is not giving me the data I need. This is returning the sum of streams for February vs the Streams for the month of January. February seems very low because we only have one week worth of data in February.

My goal is to get the last 30 days from the max date in the table using a lag function. So if the max date is Feb. 7 2023, I want the current month to include data from Jan. 7 2023 - Feb. 7 2023, and the previous month to include data from Dec. 7 2022 - Jan. 7 2023. I am thinking to create a custom month date part that will start from the max date and give a month number to the last 30 days . (2 for Jan 7 - Feb 7, 1 for Dec 7 - Jan-7...) I am not sure how to go about this. This is in SQL Server and I am looking to use the lag function for performance reasons.


Solution

  • I think you could probably use something like datediff(d, date_you_care_about, max_date)/30 in your group by and partition by clauses.

    The basic idea is that integer division rounds down, so if the difference between the dates is < 30, dividing it by 30 is 0. If the difference is >=30 but less than 60, dividing it by 30 is 1. And so forth.

    You can see a proof of concept in this Fiddle.