I want to only show orders where the customer had an order for the same exact item, but only the most recent, completed order placed before.
I want to get the most immediate order placed for the same customer, for the same item. Like showing duplicates, but just the most recent.
The query works fine in accomplishing what I want it to do, but when I add the cross apply to my actual query, it slows it down by a LOT.
EDIT: I've also tried select top 1 rather than using the row number. The rownumber line makes it only 1 second faster.
declare @orders as table (
ord_id numeric(7,0),
customer_id numeric(4,0),
order_time datetime,
item_id numeric (4,0),
status int NOT NULL
)
insert into @orders values
(1516235,5116,'06/04/2021 11:06:00', 5616, 1),
(1516236,5116,'06/03/2021 13:51:00', 5616, 1),
(1514586,5554,'06/01/2021 08:16:00', 5616, 1),
(1516288,5554,'06/01/2021 15:35:00', 5616, 1),
(1516241,5554,'06/04/2021 11:11:00', 4862, 1),
(1516778,5554,'06/04/2021 11:05:00', 4862, 2)
select distinct *
from @orders o
cross apply (
select a.ord_id, row_number() over (partition by a.customer_id order by a.order_time) as rownum
from @orders a
where a.customer_id = o.customer_id and
a.status != 2 and
a.item_id = o.item_id and
a.order_time < o.order_time
)a
where a.rownum = 1
Is there some other way I can do this? How can I speed this up?
The previous order has to have
That's silly. Here's a simpler method using cross apply
:
select o.*
from @orders o cross apply
(select top (1) a.ord_id
from @orders a
where a.customer_id = o.customer_id and
a.status <> 2 and
a.item_id = o.item_id and
a.order_time < o.order_time
order by a.order_time
) a;
This can use an index on (customer_id, item_id, status, order_time)
.
Note: If you want the most recent of the previous order, then the order by
should use desc
. However, that is not how the code is phrased in the question.
And, you should be able to use window functions. If ord_id
increases with time:
min(case when status <> 2 then ord_id end) over (partition by customer_id, item_id)
Even if this is not true, there is a variation, but it is more complicated (i.e. requires a subquery) because of the filtering on status.