My requirement: for a given month, count the customers whose previous "Sale date" was more than 3 months before that month, and, of those customers, count how many have a "Sale date" in the given month.
I tried using the LAG function, but my "Reactivated_Guests" column always comes back null.
SELECT datepart(month,["sale date"]) "Sale_Month",
count(distinct ["user id"]) "Lost_Guests",
lag("Guests",4) OVER (ORDER BY "Sale_Month")+
lag("Guests",5) OVER (ORDER BY "Sale_Month")+
lag("Guests",6) OVER (ORDER BY "Sale_Month")+
lag("Guests",7) OVER (ORDER BY "Sale_Month")+
lag("Guests",8) OVER (ORDER BY "Sale_Month")+
lag("Guests",9) OVER (ORDER BY "Sale_Month")+
lag("Guests",10) OVER (ORDER BY "Sale_Month")+
lag("Guests",11) OVER (ORDER BY "Sale_Month")+
lag("Guests",12) OVER (ORDER BY "Sale_Month") "Reactivated_Guests"
group by "Sale_Month"
order by "Sale_Month"
My expected output is, for each month, the number of guests whose previous "Sale date" is more than 3 months before the given month (Lost_Guests) and, of these customers, how many have a "Sale date" in the given month (Reactivated_Guests).
Expected Result:
Sale_Month   Lost_Guests   Reactivated_Guests
June         1,200         110
July         1,800         130
Aug          1,900         140
Lost_Guests: previous Sale date > 3 months.
Reactivated_Guests: previous Sale date > 3 months and a Sale date in the given month.
Actual Result:
Sale_Month   Lost_Guests   Reactivated_Guests
June         1,200         null
July         1,800         null
Aug          1,900         null
Sample Data:
Customer   Sale Date
AAAAA      11/15/2018
BBBBB      11/16/2018
CCCCC      9/23/2018
CCCCC      1/25/2019
AAAAA      3/16/2019
CCCCC      3/18/2019
For the given month of March 2019, AAAAA should be counted in "Lost_Guests" because AAAAA's previous sale date (11/15/2018) is more than 3 months before the given month, and in "Reactivated_Guests" because AAAAA has a Sale date in the given month (3/16/2019).
For the given month of March 2019, CCCCC should not be counted in "Lost_Guests" because CCCCC's previous sale date (1/25/2019) is less than 3 months before the given month, and hence CCCCC does not appear in "Reactivated_Guests" either.
This addresses the original version of the question.
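One possible reason for the nulls (an assumption about the cause, not something the question confirms): LAG returns NULL whenever its offset reaches back past the first row of the window and no default is supplied, and NULL plus anything is NULL. With only three monthly rows and offsets of 4 to 12, every term would be NULL. The sketch below reproduces that behaviour in SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions); the `monthly` table and its `guests` values are illustrative, borrowed from the result table in the question.

```python
import sqlite3

# Illustrative three-row table: one row per month, as in the question's output.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly (sale_month INTEGER, guests INTEGER)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?)",
    [(6, 1200), (7, 1800), (8, 1900)],
)

rows = conn.execute(
    """
    SELECT sale_month,
           LAG(guests, 1) OVER (ORDER BY sale_month) AS lag_1,
           LAG(guests, 4) OVER (ORDER BY sale_month) AS lag_4,
           LAG(guests, 1) OVER (ORDER BY sale_month)
             + LAG(guests, 4) OVER (ORDER BY sale_month) AS lag_sum
    FROM monthly
    ORDER BY sale_month
    """
).fetchall()

for row in rows:
    # lag_4 is always None here (no row exists 4 positions back),
    # so lag_sum is None on every row as well.
    print(row)
```

A third argument to LAG, for example `LAG(guests, 4, 0)`, supplies a default instead of NULL, but that only hides the symptom: summing lagged monthly totals still does not track individual customers.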
You seem to want something like this:
select sale_month,
       count(distinct user_id) as guests,
       count(distinct case when min_sale_date < sale_date - interval '3 month' then user_id end) as old_guests
from (select t.*,
             min(sale_date) over (partition by user_id) as min_sale_date
      from t
     ) t
group by sale_month
order by sale_month;
Note that date functions are very database dependent, so the exact syntax might vary depending on your database.
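To make the date handling concrete, here is a minimal sketch that runs against the sample data from the question, using SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions). It is a variation, not the query above: it compares each sale with the same customer's previous sale via LAG partitioned by customer, rather than with the customer's earliest sale. The table name `sales`, the column names, the ISO date format, and the cutoff rule (previous sale before the first day of the sale month minus 3 months) are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (user_id TEXT, sale_date TEXT)")
# Sample data from the question, rewritten as ISO dates so text comparison works.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("AAAAA", "2018-11-15"),
        ("BBBBB", "2018-11-16"),
        ("CCCCC", "2018-09-23"),
        ("CCCCC", "2019-01-25"),
        ("AAAAA", "2019-03-16"),
        ("CCCCC", "2019-03-18"),
    ],
)

query = """
WITH with_prev AS (
    SELECT user_id,
           sale_date,
           -- previous sale of the same customer; NULL for a first sale
           LAG(sale_date) OVER (PARTITION BY user_id ORDER BY sale_date) AS prev_sale_date
    FROM sales
)
SELECT strftime('%Y-%m', sale_date) AS sale_month,
       COUNT(DISTINCT user_id) AS guests,
       -- assumed cutoff: previous sale before (start of sale month - 3 months)
       COUNT(DISTINCT CASE
                 WHEN prev_sale_date < date(sale_date, 'start of month', '-3 months')
                 THEN user_id
             END) AS reactivated_guests
FROM with_prev
GROUP BY strftime('%Y-%m', sale_date)
ORDER BY 1
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Under these assumptions, March 2019 counts AAAAA as reactivated and CCCCC as not, which matches the reasoning given with the sample data. The sketch covers only the reactivated count; a "Lost_Guests" figure that includes customers with no sale in the month would likely need a separate list of months to join against.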