I have 3 activities in a single column (completed_order, email, viewed)
and I want to count, how many email
activities occurred in between each completed_order
activity and save them in a different column.
I wrote this Query:
SELECT activity_id, ts, customer, activity ,
case when activity = 'completed_order'
then count(*)filter (where activity = 'email' ) over (partition by customer order by ts )
else null end as Aggregate_in_between
FROM public.activity_stream as az1 where customer = 'Lehmanns Marktstand' order by ts ;
And I am getting the below result with the above query.
activity_id | ts | customer | activity | agg_in_btw |
---|---|---|---|---|
11089 | "1996-08-12 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 0 |
10279 | "1996-08-13 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 0 |
11077 | "1996-08-14 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
11092 | "1996-08-17 00:00:00+05" | "Lehmanns Marktstand" | "viewed_page" | |
11088 | "1996-08-18 00:00:00+05" | "Lehmanns Marktstand" | "viewed_page" | |
10284 | "1996-08-19 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 1 |
11078 | "1996-08-20 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
11079 | "1996-08-21 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
11080 | "1996-10-21 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
10343 | "1996-10-31 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 4 |
11090 | "1996-11-01 00:00:00+05" | "Lehmanns Marktstand" | "viewed_page" | |
11091 | "1996-11-02 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
10497 | "1997-04-04 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 5 |
10522 | "1997-04-30 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 5 |
My Desired Result should be like this
activity_id | ts | customer | activity | agg_in_btw |
---|---|---|---|---|
11089 | "1996-08-12 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 0 |
10279 | "1996-08-13 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 1 |
11077 | "1996-08-14 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
11092 | "1996-08-17 00:00:00+05" | "Lehmanns Marktstand" | "viewed_page" | |
11088 | "1996-08-18 00:00:00+05" | "Lehmanns Marktstand" | "viewed_page" | |
10284 | "1996-08-19 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 3 |
11078 | "1996-08-20 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
11079 | "1996-08-21 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
11080 | "1996-10-21 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
10343 | "1996-10-31 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 1 |
11090 | "1996-11-01 00:00:00+05" | "Lehmanns Marktstand" | "viewed_page" | |
11091 | "1996-11-02 00:00:00+05" | "Lehmanns Marktstand" | "email" | |
10497 | "1997-04-04 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 0 |
10522 | "1997-04-30 00:00:00+05" | "Lehmanns Marktstand" | "completed_order" | 0 |
Try this way:
select activity_id, ts, customer, activity ,
case when activity = 'completed_order' then
sum(case when activity='email' then 1 else 0 end) over (partition by customer,grp1)
else
null
end "Aggregate_in_between"
from (
select *, sum(grp) over (partition by customer order by ts) "grp1" from (
select *, case when activity='completed_order' then 1 else 0 end "grp" from activity_stream order by ts
) t ) q
Here you have to create a group of each completed_order
occurance and them count the email
in each group.
you can add where clause as per your convenience