I am having trouble figuring out how to write a window function that solves my problem. I am quite the novice at window functions, but I think one could be written to meet my needs.
Problem Statement:
I want to calculate a transfer sequence showing when person has changed locations based on the corresponding location ID over time.
Sample Data (Table1)
+----------+------------+-----------+---------+
| PersonID | LocationID | Date | Time |
+----------+------------+-----------+---------+
| 12 | A | 6/17/2020 | 12:00PM |
+----------+------------+-----------+---------+
| 12 | A | 6/18/2020 | 1:00PM |
+----------+------------+-----------+---------+
| 12 | B | 6/18/2020 | 6:00AM |
+----------+------------+-----------+---------+
| 12 | C | 6/19/2020 | 3:00PM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 8:00AM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 11:00AM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 12:00AM |
+----------+------------+-----------+---------+
| 13 | B | 6/16/2020 | 4:00PM |
+----------+------------+-----------+---------+
Expected Results
+----------+------------+-----------+---------+-------------------+
| PersonID | LocationID | Date | Time | Transfer Sequence |
+----------+------------+-----------+---------+-------------------+
| 12 | A | 6/17/2020 | 12:00PM | 1 |
+----------+------------+-----------+---------+-------------------+
| 12 | A | 6/18/2020 | 1:00PM | 1 |
+----------+------------+-----------+---------+-------------------+
| 12 | B | 6/18/2020 | 6:00AM | 2 |
+----------+------------+-----------+---------+-------------------+
| 12 | C | 6/19/2020 | 3:00PM | 3 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 8:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 11:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 12:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | B | 6/16/2020 | 4:00PM | 2 |
+----------+------------+-----------+---------+-------------------+
What I Tried
SELECT
[t1].[PersonID]
,[t1].[LocationID]
,[t1].[Date]
,[t1].[Time]
,DENSE_RANK()
OVER(
partition BY [t1].[PersonID], [t1].[LocationID]
ORDER BY [t1].[Date] ASC, [t1].[Time] ASC) AS
[Transfer Sequence]
FROM Table1 [t1]
Unfortunately, I believe DENSE_RANK() is assigning a rank regardless of whether the value of LocationID has changed. I need a function that will only add one to the sequence when the LocationID has changed.
Any help would be greatly appreciated.
Thank you!
You want to put "adjacent" rows in the same group. Straigt window functions cannot do that for you - we would need to use a gaps-and-island technique:
select
t.*,
sum(case when locationID = lagLocationID then 0 else 1 end)
over(partition by personID order by date, time)
as transfert_sequence
from (
select
t.*,
lag(locationID)
over(partition by personID order by date, time)
as lagLocationID
from mytable t
) t
The idea is to compute a window sum that increments everytime the locationID changes.
Note that this would properly handle the case when a person comes back to a location they have already been before.