Aggregations of transactions between two timestamps by an ID in SQL

I'm trying to calculate velocity aggregations over different time windows for ids in a sql table. I have a table:

TABLE transactions (
  transaction_id int,
  customer_id int,
  date Timestamp,
  should_aggregate boolean,
)

and i'm trying to create a table with velocity features at the time of the transaction

TABLE transaction_velocity (
  transaction_id int,
  customer_id int,
  date Timestamp,
  previous_1hour int,
  previous_1day  int,
  previous_1week int
)

For example if we had:

transaction_id	customer_id	date	should_aggregate
1	1	2023-01-10 01:00:00 +0000	true
2	1	2023-01-10 00:55:00 +0000	false
3	1	2023-01-09 00:57:00 +0000	true
4	1	2023-01-07 00:57:00 +0000	false
5	2	2023-01-10 00:57:00 +0000	true

The resulting table would be:

transaction_id	customer_id	date	previous_1hour	previous_1day	previous_1week
1	1	2023-01-10 01:00:00 +0000	1	1	3
3	1	2023-01-09 00:57:00 +0000	0	0	1
5	2	2023-01-10 00:57:00 +0000	0	0	0

Essentially each row is aggregated on past transactions per customer over 3 time windows if it's marked for aggregation. I've considered using partition windowing, but can't figure out how to do that over timestamp.

Solution

As you indicated, you might be able to do this with some window functions, however I think I might just recommend a self join:

SELECT
 t1.transaction_id
 , t1.customer_id
 , t1.date
 , SUM(CASE WHEN t2.date > DATE_SUB(t1.date, INTERVAL 1 HOUR) THEN 1 ELSE 0 END) previous_1hour
 , SUM(CASE WHEN t2.date > DATE_SUB(t1.date, INTERVAL 1 DAY) THEN 1 ELSE 0 END) previous_1day
 , SUM(CASE WHEN t2.date > DATE_SUB(t1.date, INTERVAL 1 WEEK) THEN 1 ELSE 0 END) previous_1week
FROM transactions t1 LEFT OUTER JOIN transactions t2
  ON t1.customer_id = t2.customer_id AND t2.date < t1.date
WHERE t1.should_aggregate
GROUP BY t1.transaction_id, t1.customer_id, t1.date

This simply joins the records with the same customer_id and a smaller date, and the sums in the select count them if the date is large enough to be in the interval.

If someone has a simple way of doing this with window functions, sure, use that, but I think the above is pretty enough and easy to follow. (I just can't think of a good way of doing this with window functions, I'm not recommending against that.)

An alternative might be correlated subqueries in the select:

SELECT
  transaction_id
  , customer_id
  , date
  , (SELECT count(*) FROM transactions t2 WHERE t2.customer_id = t1.customer_id AND t2.date > DATE_SUB(t1.date, INTERVAL 1 HOUR) AND t2.date < t1.date) previous_1hour
  , (SELECT count(*) FROM transactions t2 WHERE t2.customer_id = t1.customer_id AND t2.date > DATE_SUB(t1.date, INTERVAL 1 DAY) AND t2.date < t1.date) previous_1day
  , (SELECT count(*) FROM transactions t2 WHERE t2.customer_id = t1.customer_id AND t2.date > DATE_SUB(t1.date, INTERVAL 1 WEEK) AND t2.date < t1.date) previous_1week
FROM transactions t1
WHERE should_aggregate

You can see both of these options generating your expected output in this Fiddle.