I'm working on snowflake to solve a problem. I wanted to find the number of events for the first 24 hours for each user id.
This is a snippet of the database table I'm working on. I modified the table and used a date format without the time for simplification purposes.
user_id | client_event_time |
---|---|
1 | 2022-07-28 |
1 | 2022-07-29 |
1 | 2022-08-21 |
2 | 2022-07-29 |
2 | 2022-07-30 |
2 | 2022-08-03 |
I used the following approach to find the minimum event time per user_id.
SELECT user_id, client_event_time,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY client_event_time) row_number,
MIN(client_event_time) OVER (PARTITION BY user_id) MinEventTime
FROM Data
ORDER BY user_id, client_event_time;
user_id | client_event_time | row_number | MinEventTime |
---|---|---|---|
1 | 2022-07-28 | 1 | 2022-07-28 |
1 | 2022-07-29 | 2 | 2022-07-28 |
1 | 2022-08-21 | 3 | 2022-07-28 |
2 | 2022-07-29 | 1 | 2022-07-29 |
2 | 2022-07-30 | 2 | 2022-07-29 |
2 | 2022-08-03 | 3 | 2022-07-29 |
Then I tried to find the difference between the minimum event time and client_event_time, and if the difference is less than or equal to 24, I counted the client_event_time.
with NewTable as (
(SELECT user_id,client_event_time, event_type,
row_number() over (partition by user_id order by CLIENT_EVENT_TIME) row_number,
MIN(client_event_time) OVER (PARTITION BY user_id) MinEventTime
FROM Data
ORDER BY user_id, client_event_time))
SELECT user_id,
COUNT(case when timestampdiff(hh, client_event_time, MinEventTime) <= 24 then 1 else 0 end) AS duration
FROM NEWTABLE
GROUP BY user_id
I got the following result:
user_id | duration |
---|---|
1 | 3 |
2 | 3 |
I wanted to find the following result:
user_id | duration |
---|---|
1 | 2 |
2 | 2 |
Could you please help me solve this problem? Thanks!
Easier done with a qualify
with cte as
(select *
from mytable
qualify event_time<=min(event_time) over (partition by user_id) + interval '24 hours')
select user_id, count(*) as counts
from cte
group by user_id