Sorry if this has been asked, but I didn't find anything when searching: I have a large table of ~100k rows in SQL Server. Within each row is a date range, which in many cases spreads across multiple months (and years to a lesser extent). The ranges are typically about 30-35 days however they usually don't start at the the 1st of the month. An example of a typical date range is 01/10/2017-02/11/2017.
I'm looking for the most efficient way to output the month with the most days within in that range as it's own column. I'm doing the same thing for the year
Right now I have the following in my query:
SELECT DISTINCT
a.START_DATE,
a.END_DATE,
cast(month(dateadd(day, datediff(day, a.Start_Date, a.End_Date)/2, a.Start_Date)) as tinyint) as Main_Month,
cast(year(dateadd(day, datediff(day, a.Start_Date, a.End_Date)/2, a.Start_Date)) as smallint) as Main_Year
FROM TABLE
The output from that query using the above date range example would give me:
Start_Date: 01/10/2017 End_Date: 02/11/2017 Main_Month: 1 Main Year: 2017
That method has worked alright, but it slows when being done for all rows in the table. Are there any more efficient alternatives that I can use for the Main_Month and Main_Year columns?
EDIT IN RESPONSE TO COMMENTS:
As discused in the comments above, this solution is not perfectly accurate, but it mirrors the accuracy of the original slower solution:
SELECT b.START_DATE, b.END_DATE, month(b.mid_point) as Main_Month, year(b.midpoint) as Main_Year FROM
(SELECT DISTINCT
a.START_DATE,
a.END_DATE,
dateadd(day, datediff(day, a.Start_Date, a.End_Date)/2, a.Start_Date) as mid_Point
FROM a) as b
You should be able to speed it up by making these two changes. First, only compute the datediff and dateadd once, then take it from the derived table to get the two fields you need. Next, don't bother with casting, since Month() and year() both do that for you. Were you able to see a speed difference with this method?