I want to use two different (yet similar) window functions to calculate two values
SUM
and COUNT
on is_active over user_id+item
, only up to the time of the row - minus 1 hour.
My intuition was use ROWS UNBOUNDED PRECEDING
but that way I can't filter over the time
COUNT(1) OVER(PARTITION BY user_id, item ORDER BY req_time ROWS UNBOUNDED PRECEDING)
SUM(is_active) OVER(PARTITION BY user-id, item ORDER BY req_time ROWS UNBOUNDED PRECEDING)
However, that doesn't take under consideration the '1 hour ago' interval factor
Consider the following data:
user_id | req_time | item | is_active |
--------+--------------------+-------------------+---
1 | 2011-01-01 12:00:00| 1 | 0 |
1 | 2011-01-01 12:30:00| 1 | 1 |
1 | 2011-01-01 15:00:00| 1 | 1 |
1 | 2011-01-01 16:00:00| 1 | 0 |
1 | 2011-01-01 16:00:00| 2 | 0 |
1 | 2011-01-01 16:20:00| 2 | 1 |
2 | 2011-02-02 11:00:00| 1 | 1 |
2 | 2011-02-02 13:00:00| 1 | 0 |
1 | 2011-02-02 16:20:00| 1 | 0 |
1 | 2011-02-02 16:30:00| 2 | 0 |
I expect the following results: "value 1" is SUM(is_active) and "value 2" is COUNT(1):
user_id | req_time | item | value 1 | value 2 |
--------+--------------------+-----------------+---------+
1 | 2011-01-01 12:00:00| 1 | 0 | 0 |
1 | 2011-01-01 12:30:00| 1 | 0 | 0 |
1 | 2011-01-01 15:00:00| 1 | 1 | 2 |
1 | 2011-01-01 16:00:00| 1 | 2 | 3 |
1 | 2011-01-01 16:00:00| 2 | 0 | 0 |
1 | 2011-01-01 16:20:00| 2 | 0 | 0 |
2 | 2011-02-02 11:00:00| 1 | 0 | 0 |
2 | 2011-02-02 13:00:00| 1 | 1 | 1 |
1 | 2011-02-02 16:20:00| 1 | 2 | 4 |
1 | 2011-02-02 16:30:00| 2 | 1 | 2 |
I'm using Greenplum 4.21 which is based on Postgresql 8.2.15
Thanks in advance! gilibi
8.3 at SQL Fiddle. Only one subselect.
select user_id, req_time, item, v[1] as value1, v[2] as value2
from (
select t.*,
(
select array[
coalesce(sum(is_active::integer), 0),
count(*)
] as v
from t s
where
user_id = t.user_id
and item = t.item
and req_time <= t.req_time - interval '1 hour'
) as v
from t
) s
order by req_time, user_id, item