Search code examples
mysqlselectlagdatediff

MySQL Days Between One Order and Next Order Having > 1 Order


My goal is to find how many customers order more than once between the criteria below. Or to put into other terms, how long it takes for customers to place their next order with these criteria:

  • 0 - 12 mos
  • 13 - 24 mos
  • 25 - 36 mos
  • 37+ mos

Table is set up by line item on each order. like this:

Customer | Item  | OrderNumber | OrderDate
1500       item1   5555          2015-02-01
1500       item2   5555          2015-02-01
1500       item34  5255          2014-05-25
1500       item44  4100          2012-12-30
2200       item55  5100          2014-02-15
2200       item1   5100          2014-02-15
3255       item12  5300          2015-03-05
3255       item34  5399          2014-05-01
3255       item22  5399          2014-05-01

So if it takes less than 12 mos for a customer to order more than once then it should be counted towards the "0-12 mos". If a customer took 18 mos to place their next order they would be counted towards the "13-24 mos" and so on and so forth.

I don't really know where to begin on this one. I probably will have to at least have: HAVING COUNT(DISTINCT OrderNumber) > 1. I have never used LAG, is this something that I should utilize a MySQL variant of to find the next OrderDate in the sequence?

Any help would be appreciated to at least start to identify the components of the query needed.


Solution

  • If you just wanted the average time between orders, you can do something like this:

    select floor(days_between / 365) as numyears, count(*)
    from (select customer, datediff(max(orderdate), min(orderdate)) as days_between
                 count(*) as numorders
          from orders
          group by customer
          having count(*) >= 2
         ) c
    group by numyears;
    

    If you really want to understand the timing between orders, learn about survival analysis, particularly recurrent event analysis. Your question, although well-enough formed, is rather naive because it does not consider customers with only one order nor the time since a customer's last order.