Using Standard SQL in Google BigQuery.
I have a table with 2 order types: A and B.
Id | Type | OrderDate
-----------------
1 | A | 2019-03-01
2 | B | 2019-03-04
3 | B | 2019-03-04
4 | A | 2019-03-05
5 | A | 2019-03-06
6 | B | 2019-04-05
For each Order type A, I want to work out when the next Order B was, ignoring all other Order A's.
So in my example data, if I would like to return the following:
Id | Type | NextOrderBDate
--------------------------------
1 | A | 2019-03-04
4 | A | 2019-04-05
5 | A | 2019-04-05
I did achieve the result extremely inefficiently by joining 2 separate tables of A and B to each other - but the dataset is very large and it took over an hour to run.
What I'm currently trying to do is use a LEAD statement like so:
SELECT Id, Type,
LEAD(OrderDate) OVER (PARTITION BY Id ORDER BY OrderDate)
FROM xxx
Obviously the problem here is that it will return the next date, regardless of order type.
I am wondering if the key to do this is to work out the correct offset that each row needs to LEAD to the next Order of Type B, and am struggling to find a (clean) solution here.
Thanks in advance.
@Gordon Linoff is right except small bug: next B order should be found related to each current order. So window of query should IMHO be adapted appropriately:
with t (id, type, orderdate) as (
select 1 , 'A' , date '2019-03-01' union
select 2 , 'B' , date '2019-03-04' union
select 3 , 'B' , date '2019-03-04' union
select 4 , 'A' , date '2019-03-05' union
select 5 , 'A' , date '2019-03-06' union
select 6 , 'B' , date '2019-04-05'
)
select t.*
from (select t.*,
min(case when type = 'B' then orderdate end)
over (order by orderdate
rows between current row and unbounded following
) as next_b_orderdate
from t
) t
where type = 'A';