Search code examples
sqldatetimegoogle-bigqueryfrequency

How to calculate date frequency in bigquery for purchase history


I have the following table structure:

email | date  | YEAR

antrim@8qwh37kibb6ut7.ml  | 2018-10-25   | 2018    
asam@bpvi.ml              | 2018-02-15   | 2018   
client5@test.com          | 2018-04-14   | 2018      
antrim@8qwh37kibb6ut7.ml  | 2017-07-19   | 2017      
client4@test.com          | 2017-09-30   | 2017         
asam@bpvi.ml              | 2017-01-25   | 2017        
antrim@8qwh37kibb6ut7.ml  | 2017-10-25   | 2017        
asam@bpvi.ml              | 2016-10-25   | 2016        
antrim@8qwh37kibb6ut7.ml  | 2016-05-17   | 2016        

First I need the customers that only bought on the three years, the ones that repeated. So for that I did a self join three times although I suspect is not the most efficient

SELECT COUNT(DISTINCT first_year.email) as repetidos
 FROM `table` AS first_year
 JOIN `table` AS second_year
  ON first_year.email = second_year.email
 JOIN `table` AS third_year
  ON first_year.email = third_year.email
 AND first_year.year = 2016
 WHERE second_year.year = 2017
 AND third_year.year = 2018

And this gives me the emails of the people who have bought on the three years. Like so

email

antrim@8qwh37kibb6ut7.ml  
asam@bpvi.ml              

I need to know the frequency in days, meaning, If you have bought 5 times in the store I need to know the frequency of your purchases, if you buy every 40 days, every 20 days and so, I found this

SELECT DATE_DIFF(DATE '2018-02-20', DATE '2018-01-15', DAY) as days_diff;

But what I don't know how to do is use that formula with my previous query. Is that possible? Thank you very much!


Solution

  • I think you might want:

    SELECT t.email,
           DATE_DIFF(MAX(date), MIN(date), day) / NULLIF(COUNT(*) - 1, 0) as avg_date_diff
    FROM `table` t
    WHERE t.year IN (2016, 2017, 2018)
    GROUP BY t.email
    HAVING COUNT(DISTINCT t.year) = 3