I'm not sure how to code to get the result I'm looking for. I hope the community can point me in the right direction. I need to run a report against the Rent History table, and I expect the report to look like the Final Result. Please see the image for details.
There are three tags in the Rent History table. I was able to use ROW_NUMBER() OVER (PARTITION BY) to determine the On/Off Rent start date based on the position. This is where I stopped because I'm not sure what to do next. So far, this is the code that I have.
SELECT tag
,progress_complete_date_time
,create_date
,required_until_date
,modify_date
,on_rent_status
,[date_on-off_rent]
,position
,CASE
WHEN position = 1
THEN [date_on-off_rent]
END AS [first_time_on-off_rent]
,RANK() OVER(PARTITION BY tag ORDER BY modify_date) AS [RANK]
FROM CleanUpB
The DateOnRent and DateOffRent columns are based on how the tags start out. Some tags start with DateOnRent and some with DateOffRent. I only need to capture the start and end status of on_rent_status. Here is my formula to calculate DaysOnRent and DaysOffRent.
TAG-000479 DaysOnRent DATEDIFF(DAY,'8/12/2021','4/4/2024') = 966 DATEDIFF(DAY,'04/04/2024','06/10/2024') = 67
TAG-000479 DaysOffRent DATEDIFF(DAY,'04/04/2024','04/04/2024') = 0 DATEDIFF(DAY,'06/10/2024',GETDATE()) = 38
TAG-001832 DaysOnRent DATEDIFF(DAY,'03/28/2024',GETDATE()) = 112
TAG-003420 DaysOnRent DATEDIFF(DAY,'06/24/2024',GETDATE()) = 24
Based on feedback from OP. This is not just a simple min/max situation, it's a gaps and islands problem. In this case, we have to deal with multiple "on_rent" records and potentially multiple "off_rent" records as well as gaps in between "off_rent" and "on_rent".
This is my solution so far (see bottom of response to populate sample data table):
WITH cte AS (
-- Collapse groups to get the start and end rent dates for each
SELECT x.tag, x.GroupID
, DateOnRent = MIN(IIF(x.on_rent_status = 1, x.[date_on-off_rent], NULL)) -- Using min because we want the FIRST time on_rent was logged
, DateOffRent = MIN(IIF(x.on_rent_status = 0, x.[date_on-off_rent], NULL)) -- Using min because we want the FIRST time off_rent was logged
FROM (
SELECT x.tag, x.[date_on-off_rent], x.on_rent_status
-- If the status changed from off_rent to on_rent, then we know a new group has started
-- Returning 1 to the running total will generate a grouping ID that keeps sequential rent_status records together
, GroupID = SUM(IIF(x.on_rent_status = 1 AND x.prev_on_rent_status = 0, 1, 0)) OVER (PARTITION BY x.tag ORDER BY x.position)
FROM (
SELECT tag, on_rent_status, [date_on-off_rent], position
-- Getting the previous value will help us later determine whether the status changed
, prev_on_rent_status = LAG(on_rent_status) OVER (PARTITION BY tag ORDER BY position)
FROM #rent_history
) x
) x
GROUP BY x.tag, x.GroupID
)
SELECT x.tag
, FirstDateOnRent = MIN(x.DateOnRent)
, LastDateOffRent = MAX(x.LastDateOffRent) -- NULL if still on_rent
, DaysOnRent = SUM(x.DaysOnRent)
, DaysOffRent = DATEDIFF(DAY, MAX(x.LastDateOffRent), GETDATE()) -- NULL if still on_rent
FROM (
SELECT x.tag, x.DateOnRent, x.DateOffRent
-- We want the most recent state of DateOffRent, rather than MAX, which means if they are still on_rent, it should return null
, LastDateOffRent = LAST_VALUE(x.DateOffRent) OVER (PARTITION BY x.tag ORDER BY x.DateOnRent)
-- Number of DaysOnrent needs to be calc'd at the group level so we don't accidentally include gaps
, DaysOnRent = DATEDIFF(DAY, x.DateOnRent, COALESCE(x.DateOffRent, GETDATE()))
FROM cte x
) x
GROUP BY x.tag;
I tried to comment the code as best I could, so I won't completely dive into how this works line by line.
But the basic idea is...calculate some sort of grouping ID that can keep sequential events together. In this case I relied on both LAG()
and SUM({state change logic}) OVER()
to generate that ID.
Then from there it's just a lot of extra grouping and aggregation to get the final output.
This new query version will support multiple on_rent
records, multiple off_rent
records as well as gaps between off_rent
and on_rent
.
Based on your explanation, you have a table/dataset called "Rent History", which represents the top dataset in your screenshot.
And you want a query against that dataset whose output matches the second dataset in your screenshot labeled "Final Result".
Assuming that's true, this is what I've come up with...problem is, I'm making a bunch of assumptions around business logic that may not be correct, even if it does match your output table...
SELECT x.tag, x.DateOnRent, x.DateOffRent
, DaysOnRent = DATEDIFF(DAY, x.DateOnRent, COALESCE(x.DateOffRent, GETDATE()))
, DaysOffRent = DATEDIFF(DAY, x.DateOffRent, GETDATE())
FROM (
SELECT tag
, DateOnRent = MIN(IIF(on_rent_status = 1, [date_on-off_rent], NULL))
, DateOffRent = MAX(IIF(on_rent_status = 0, [date_on-off_rent], NULL))
FROM #rent_history
GROUP BY tag
) x
I also don't know how large the underlying datasets are this is running against, so I have no idea how efficient it would be.
The concept seems pretty simple though...grab the earliest "on_rent" date and the oldest "off_rent" date. Then do the datediff calculations using those?
For anyone who wants the script I used to generate the sample data...
CREATE TABLE #rent_history (
tag nvarchar(20) NOT NULL,
progress_complete_date_time datetime NOT NULL,
create_date datetime NOT NULL,
required_until_date datetime NOT NULL,
modify_date datetime NOT NULL,
on_rent_status bit NOT NULL,
[date_on-off_rent] datetime NOT NULL,
position int NOT NULL,
[first_time_on-off_rent] datetime NULL,
[RANK] int NOT NULL,
);
INSERT INTO #rent_history
VALUES ('TAG-000479', '08/12/21 00:00', '11/06/23 19:31', '01/31/24 00:00', '11/06/23 19:31', 1, '08/12/21 00:00', 1, '08/12/21 00:00', 1)
, ('TAG-000479', '08/12/21 00:00', '11/06/23 19:31', '01/31/24 00:00', '11/06/23 19:31', 1, '11/06/23 00:00', 2, NULL , 1)
, ('TAG-000479', '08/12/21 00:00', '11/06/23 19:31', '01/31/24 00:00', '11/06/23 19:31', 1, '11/06/23 00:00', 3, NULL , 1)
, ('TAG-000479', '08/12/21 00:00', '11/06/23 19:31', '04/01/24 00:00', '04/04/24 05:00', 0, '04/04/24 00:00', 4, NULL , 4)
, ('TAG-000479', '08/12/21 00:00', '11/06/23 19:31', '04/01/24 00:00', '04/04/24 05:00', 1, '04/04/24 00:00', 5, NULL , 5)
, ('TAG-000479', '08/12/21 00:00', '11/06/23 19:31', '12/31/24 00:00', '04/30/24 15:11', 1, '04/30/24 00:00', 6, NULL , 6)
, ('TAG-000479', '08/12/21 00:00', '11/06/23 19:31', '12/31/24 00:00', '06/10/24 17:02', 0, '06/10/24 00:00', 7, NULL , 7)
, ('TAG-001832', '03/28/24 00:00', '01/17/24 15:41', '03/31/24 00:00', '03/28/24 03:32', 1, '03/28/24 00:00', 1, '03/28/24 00:00', 1)
, ('TAG-003420', '06/24/24 00:00', '06/24/24 18:55', '12/15/24 00:00', '06/25/24 15:18', 1, '06/24/24 00:00', 1, '06/24/24 00:00', 1);