I'm having issues trying to wrap my head around how to extract some time series stats from my Postgres DB.
For example, I have several stores. I record how many sales each store made each day in a table that looks like:
+------------+----------+-------+
| Date | Store ID | Count |
+------------+----------+-------+
| 2017-02-01 | 1 | 10 |
| 2017-02-01 | 2 | 20 |
| 2017-02-03 | 1 | 11 |
| 2017-02-03 | 2 | 21 |
| 2017-02-04 | 3 | 30 |
+------------+----------+-------+
I'm trying to display this data on a bar/line graph with different lines per Store and the blank dates filled in with 0.
I have been successful getting it to show the sum per day (combining all the stores into one sum) using generate_series
, but I can't figure out how to separate it out so each store has a value for each day... the result being something like:
["Store ID 1", 10, 0, 11, 0]
["Store ID 2", 20, 0, 21, 0]
["Store ID 3", 0, 0, 0, 30]
It is necessary to build a cross join dates X stores:
select store_id, array_agg(total order by date) as total
from (
select store_id, date, coalesce(sum(total), 0) as total
from
t
right join (
generate_series(
(select min(date) from t),
(select max(date) from t),
'1 day'
) gs (date)
cross join
(select distinct store_id from t) s
) using (date, store_id)
group by 1,2
) s
group by 1
order by 1
;
store_id | total
----------+-------------
1 | {10,0,11,0}
2 | {20,0,21,0}
3 | {0,0,0,30}
Sample data:
create table t (date date, store_id int, total int);
insert into t (date, store_id, total) values
('2017-02-01',1,10),
('2017-02-01',2,20),
('2017-02-03',1,11),
('2017-02-03',2,21),
('2017-02-04',3,30);