I have a table "days_users" with dates and users (one row per date and user), from where I could extract the information of which particular days a user was seen, or which users were seen on a particular day:
2023-01-01,user1
2023-01-01,user2
2023-01-01,user3
2023-01-02,user2
2023-01-02,user4
2023-01-03,user1
2023-01-03,user4
I need to compute how many new users appear/disappear each day, and I don't know how to do it. A user "appear" on one day means that the user is seen on that day, but was never seen before, and a user "disappear" on one day means that the user was seen on that day but was never seen after that date.
I thought that a way to start is to create a view of users, first_date_seen, last_date_seen as follows:
user_first_last AS (
SELECT user, min(date) AS first_date_seen, max(date) AS last_date_seen FROM days_users
GROUP BY 1
)
And then count all the users that appear from a particular date until the end
SELECT date, COUNT(DISTINCT user) as num_appearing_users
FROM user_first_last WHERE first_date_seen = {date} AND last_date_seen = '2023-03-01'
GROUP BY 1
ORDER BY 1
And similarly for the disappearing users
SELECT date, COUNT(DISTINCT user) as num_disappearing_users
FROM user_first_last WHERE first_date_seen = '2023-01-01' AND last_date_seen = {date}
GROUP BY 1
ORDER BY 1
But please note the {date}
between curly braces: I would like this date to be the same as the date in the query, i. e. both dates in bold should be the same:
SELECT
**date**, COUNT(DISTINCT user) AS num_disappearing_users
FROM user_first_last
WHERE first_date_seen = '2023-01-01'
AND last_date_seen = **date**
How can I achieve this?
You can do this:
with dates as(
select distinct date from days_users),
usg as (
select min(date) first_date,
max(date) last_date,
user
from days_users
GROUP BY user
)
select date,
(select count(user)
from usg
where usg.last_date=dates.date
) never_seen_after,
(select count(user)
from usg
where usg.first_date=dates.date
) never_seen_before
from dates
Here we use CTEs:
dates
- to get list of dates present in data,usg
- list of users with their first and last day.