I have a table that contains the following information:
id | amount | date | customer_id
1 | 0.00 | 11/12/17 | 1
2 | 54.00 | 11/12/17 | 1
3 | 60.00 | 02/12/18 | 1
4 | 0.00 | 01/18/17 | 2
5 | 14.00 | 03/12/17 | 2
6 | 24.00 | 02/22/18 | 2
7 | 0.00 | 09/12/16 | 3
8 | 74.00 | 10/01/17 | 3
What I need it to look like is the following:
ranked_id | id | amount | date | customer_id
1 | 1 | 0.00 | 11/12/17 | 1
2 | 2 | 54.00 | 11/12/17 | 1
3 | 3 | 60.00 | 02/12/18 | 1
4 | 3 | 60.00 | 02/12/18 | 1
5 | 3 | 60.00 | 02/12/18 | 1
6 | 3 | 60.00 | 02/12/18 | 1
7 | 3 | 60.00 | 02/12/18 | 1
8 | 4 | 0.00 | 01/18/17 | 2
9 | 5 | 14.00 | 03/12/17 | 2
10 | 6 | 24.00 | 02/22/18 | 2
11 | 6 | 24.00 | 02/22/18 | 2
12 | 6 | 24.00 | 02/22/18 | 2
13 | 6 | 24.00 | 02/22/18 | 2
14 | 6 | 24.00 | 02/22/18 | 2
15 | 7 | 0.00 | 09/12/16 | 3
16 | 8 | 74.00 | 10/01/17 | 3
17 | 8 | 74.00 | 10/01/17 | 3
18 | 8 | 74.00 | 10/01/17 | 3
19 | 8 | 74.00 | 10/01/17 | 3
20 | 8 | 74.00 | 10/01/17 | 3
21 | 8 | 74.00 | 10/01/17 | 3
I know that there's something with partitioning and ranking (on the ranked_id), but I can't figure out how to repeat the last row 7 times.
As @Gordon Linoff suggested you can use the generate_series() function crossed with the distinct customer_ids to generate all the rows needed as in T1 below. Then in T2 (also below) the row_number function is used to generate a sequential value to outer join to from t1 along with the customer_id.
From there it's just a matter of being able to get at the last value per customer_id when there is no original data to join to which is where the case statement and analytic first_value functions come in. I couldn't get the last_value analytic function to work likely due to postgresql's lack of an ignore nulls directive, so I used first_Value with a descending sort order, and only return the analytic value when no other data exists.
with t1 as (
select distinct
dense_rank() over (order by customer_id, generate_series) ranked_id
, customer_id
, generate_series
from table1
cross join generate_series(1,7)
), t2 as (
select row_number() over (partition by customer_id order by id) rn
, table1.*
from table1
)
select t1.ranked_id
, case when t2.customer_id is not null
then t2.id
else first_value(t2.id)
over (partition by t1.customer_id
order by id desc nulls last)
end id
, case when t2.customer_id is not null
then t2.amount
else first_value(t2.amount)
over (partition by t1.customer_id
order by id desc nulls last)
end amount
, case when t2.customer_id is not null
then t2.date
else first_value(t2.date)
over (partition by t1.customer_id
order by id desc nulls last)
end date
, t1.customer_id
from t1
left join t2
on t2.customer_id = t1.customer_id
and t2.id = t1.generate_series
order by ranked_id;
Here's a SQL Fiddle demonstrating the code.