I have a table (visibility_history
) that tracks the dates when a user changes their profile's visibility. We create 1 visibility_history
row when a user profile is first created and additional rows whenever a user changes their profile visibility.
It looks approximately like
user_id | visible_before | visible_after | date |
---|---|---|---|
1 | nil | true | Jan 1, 2021 |
1 | true | false | Jan 4, 2021 |
1 | false | true | Jan 5, 2021 |
2 | nil | false | Jan 4, 2021 |
3 | nil | true | Jan 4, 2021 |
In the above example, user 1 was created on Jan 1, and user 2 and 3 were created on Jan 4. For each row representing a change in a user's visibility, we record both the visibility of their profile before the change and after the change.
I want to a query that will return the set of dates where each user's profile was visible. For example, I would make a query to return the dates where a user's profile is visible between Jan 2 and Jan 6. The results would be
user_id | date |
---|---|
1 | Jan 2, 2021 |
1 | Jan 3, 2021 |
1 | Jan 5, 2021 |
1 | Jan 6, 2021 |
3 | Jan 4, 2021 |
3 | Jan 5, 2021 |
3 | Jan 6, 2021 |
I believe that I need to use a tally table but I don't know how in this case.
You need to define bounds of your interest since it cannot be determined from your sample data you want Jan 6 but not Jan 7. I used calendar
CTE in my solution, anyway you can hardcode date constants if you want.
Then you precompute upper (inclusive) bound of each interval of visibility (I assume there are no rows with visible_before = visible_after
) and join pregenerated dates with valid interval for each user based on matching appropriate date range.
with visibility_history (user_id,visible_before,visible_after,date) as (values
(1,null ,true , date '2021-01-01'),
(1,true ,false, date '2021-01-04'),
(1,false,true , date '2021-01-05'),
(2,null ,false, date '2021-01-04'),
(3,null ,true , date '2021-01-04')
), calendar (min_date, max_date) as (values
(date '2021-01-02', date '2021-01-06')
), precomputed_interval (user_id, visible_after, since, till) as (
select h.user_id, h.visible_after, h.date
, coalesce(lead(h.date) over (partition by h.user_id order by h.date) - interval '1 day', calendar.max_date)
from visibility_history h
cross join calendar
)
select i.user_id, s.d
from generate_series((select min_date from calendar), (select max_date from calendar), '1 day'::interval) as s(d)
join precomputed_interval i on s.d between i.since and i.till
where i.visible_after
order by i.user_id, s.d