Search code examples
sqlleadgoogle-bigquery

SQL - Using LEAD to skip over rows with a certain condition


Using Standard SQL in Google BigQuery.

I have a table with 2 order types: A and B.

Id | Type | OrderDate
-----------------
1  |  A   | 2019-03-01
2  |  B   | 2019-03-04
3  |  B   | 2019-03-04
4  |  A   | 2019-03-05
5  |  A   | 2019-03-06
6  |  B   | 2019-04-05

For each Order type A, I want to work out when the next Order B was, ignoring all other Order A's.

So in my example data, if I would like to return the following:

Id | Type | NextOrderBDate
--------------------------------
1  |  A   | 2019-03-04
4  |  A   | 2019-04-05
5  |  A   | 2019-04-05

I did achieve the result extremely inefficiently by joining 2 separate tables of A and B to each other - but the dataset is very large and it took over an hour to run.

What I'm currently trying to do is use a LEAD statement like so:

SELECT Id, Type,
    LEAD(OrderDate) OVER (PARTITION BY Id ORDER BY OrderDate)
FROM xxx

Obviously the problem here is that it will return the next date, regardless of order type.

I am wondering if the key to do this is to work out the correct offset that each row needs to LEAD to the next Order of Type B, and am struggling to find a (clean) solution here.

Thanks in advance.


Solution

  • @Gordon Linoff is right except small bug: next B order should be found related to each current order. So window of query should IMHO be adapted appropriately:

    with t (id, type, orderdate) as (
      select 1  ,  'A'   , date '2019-03-01' union
      select 2  ,  'B'   , date '2019-03-04' union
      select 3  ,  'B'   , date '2019-03-04' union
      select 4  ,  'A'   , date '2019-03-05' union
      select 5  ,  'A'   , date '2019-03-06' union
      select 6  ,  'B'   , date '2019-04-05'
    )
    select t.*
    from (select t.*,
                 min(case when type = 'B' then orderdate end)
                 over (order by orderdate 
                     rows between current row and unbounded following
                 ) as next_b_orderdate
          from t
         ) t
    where type = 'A';