My goal is to find how many customers order more than once between the criteria below. Or to put into other terms, how long it takes for customers to place their next order with these criteria:
Table is set up by line item on each order. like this:
Customer | Item | OrderNumber | OrderDate
1500 item1 5555 2015-02-01
1500 item2 5555 2015-02-01
1500 item34 5255 2014-05-25
1500 item44 4100 2012-12-30
2200 item55 5100 2014-02-15
2200 item1 5100 2014-02-15
3255 item12 5300 2015-03-05
3255 item34 5399 2014-05-01
3255 item22 5399 2014-05-01
So if it takes less than 12 mos for a customer to order more than once then it should be counted towards the "0-12 mos". If a customer took 18 mos to place their next order they would be counted towards the "13-24 mos" and so on and so forth.
I don't really know where to begin on this one. I probably will have to at least have: HAVING COUNT(DISTINCT OrderNumber) > 1
. I have never used LAG
, is this something that I should utilize a MySQL variant of to find the next OrderDate
in the sequence?
Any help would be appreciated to at least start to identify the components of the query needed.
If you just wanted the average time between orders, you can do something like this:
select floor(days_between / 365) as numyears, count(*)
from (select customer, datediff(max(orderdate), min(orderdate)) as days_between
count(*) as numorders
from orders
group by customer
having count(*) >= 2
) c
group by numyears;
If you really want to understand the timing between orders, learn about survival analysis, particularly recurrent event analysis. Your question, although well-enough formed, is rather naive because it does not consider customers with only one order nor the time since a customer's last order.