I am trying to use the min() value of a timestamp as a starting point and then group data by 30 day intervals in order to get a count of occurrences for each unique value within the timestamp date range as columns
i have two tables that i am joining together to get a count. Table 1 (page_creation) has 2 columns labeled link and dt_crtd. Table 2(page visits) has 2 other columns labeled url and date. the tables are being joined by joining table1.link = table2.pagevisits.
After the join i get a table similar to this:
+-------------------+------------------------+
| url | date |
+-------------------+------------------------+
| www.google.com | 2018-01-01 00:00:00' |
| www.google.com | 2018-01-02 00:00:00' |
| www.google.com | 2018-02-01 00:00:00' |
| www.google.com | 2018-02-05 00:00:00' |
| www.google.com | 2018-03-04 00:00:00' |
| www.facebook.com | 2014-01-05 00:00:00' |
| www.facebook.com | 2014-01-07 00:00:00' |
| www.facebook.com | 2014-04-02 00:00:00' |
| www.facebook.com | 2014-04-10 00:00:00' |
| www.facebook.com | 2014-04-11 00:00:00' |
| www.facebook.com | 2014-05-01 00:00:00' |
| www.twitter.com | 2016-02-01 00:00:00' |
| www.twitter.com | 2016-03-04 00:00:00' |
+---------------------+----------------------+
what i am trying to get is results that pull this :
+-------------------+------------------------+------------+------------+-------------+
| url | MIN_Date | Interval 1 | Interval 2| Interval 3 |
+-------------------+------------------------+-------------+-----------+-------------+
| www.google.com | 2018-01-01 00:00:00' | 2 | 2 | 1
| www.facebook.com | 2014-01-05 00:00:00' | 2 | 0 | 1
| www.twitter.com | 2016-02-01 00:00:00' | 1 | 1 | 0
+---------------------+----------------------+-------------+-----------+-------------+
So the 30 day intervals begin from the min(date) as shown in Interval 1 and are counted every 30 days.
Ive looked at other questions such as :
Group rows by 7 days interval starting from a certain date
MySQL query to select min datetime grouped by 30 day intervals
However it did not seem to answer my specific problem.
Ive also looked into pivot syntax but noticed it is only supported for certain DBMS.
Any help would be greatly appreciated.
Thank you.
If you are using BigQuery, I would recommend:
countif()
to count a boolean valuetimestamp_add()
to add intervals to timestampsThe exact boundaries are a bit vague, but I would go for:
select pc.url,
countif(pv.date >= pc.dt_crtd and
pv.date < timestamp_add(pc.dt_crtd, interval 30 day
) as Interval_00_29,
countif(pv.date >= timestamp_add(pc.dt_crtd, interval 30 day) and
pv.date < timestamp_add(pc.dt_crtd, interval 60 day
) as Interval_30_59,
countif(pv.date >= timestamp_add(pc.dt_crtd, interval 60 day) and
pv.date < timestamp_add(pc.dt_crtd, interval 90 day
) as Interval_60_89
from page_creation pc join
page_visits pv
on pc.link = pv.url
group by pc.url