I have a table consisting of actions that are logged when users are viewing files. When a user views a file, this can result in multiple actions over time. As multiple users can be active at the same time, their actions can become intertwined.
I am looking to create an additional column in my query that tells me when a user started viewing this file (let's call this StartId).
When a user stops viewing a file to view some other file, and then goes back to viewing the first file, this should be regarded as a new viewing session.
This table illustrates my problem and my desired result:
Id | User | File | StartId (desired result) |
---|---|---|---|
1 | A | X | 1 |
2 | A | X | 1 |
3 | B | Y | 3 |
4 | A | X | 1 |
5 | B | Y | 3 |
6 | A | Y | 6 |
7 | A | X | 7 |
The closest I have come is with this line:
StartId = FIRST_VALUE(Id) OVER (PARTITION BY User, File ORDER BY Id)
However, this has the following result for the last action in the example:
Id | User | File | StartId |
---|---|---|---|
7 | A | X | 1 |
Can someone point me in the right direction with this?
you can use LAG
to see if the previous file for the same user was the same as the current file for that user and so whether we are in a new session or not and then use that result accordingly.
WITH T AS
(
SELECT *,
CASE WHEN "File" = LAG("File") OVER (PARTITION BY "User" ORDER BY "Id") THEN NULL ELSE "Id" END AS NewSessionFlag
FROM YourTable
)
SELECT *,
MAX(NewSessionFlag) OVER (PARTITION BY "User" ORDER BY "Id" ROWS UNBOUNDED PRECEDING)
FROM T
ORDER BY "Id"