Here's an example of the data set layout: https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=b537554bb5e9dbde4f6c662fc302db5f
I want to write a select query to get the last product that arrived for each Vendor, by date.
Sample data:
Line | date | Vendor | Product_Name | Arrival_Time ---: | :---------------------- | :----------------- | :----------- | :----------- 1 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Bananas | 14:30:00 2 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Apples | 13:30:00 3 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Oranges | 08:30:00 4 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Apples | 15:30:00 5 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Oranges | 12:30:00 6 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Bananas | 04:30:00 7 | 2020-09-01 00:00:00.000 | Fruits & More | Bananas | 21:30:00 8 | 2020-09-01 00:00:00.000 | Fruits & More | Apples | 00:30:00 9 | 2020-09-01 00:00:00.000 | Fruits & More | Oranges | 05:30:00 10 | 2020-09-02 00:00:00.000 | Fruits & More | Apples | 23:30:00 11 | 2020-09-02 00:00:00.000 | Fruits & More | Oranges | 15:30:00 12 | 2020-09-02 00:00:00.000 | Fruits & More | Bananas | 01:30:00
Desired results:
Line | date | Vendor | Product_Name | Arrival_Time ---: | :---------------------- | :----------------- | :----------- | :----------- 1 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Bananas | 14:30:00 4 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Apples | 15:30:00 7 | 2020-09-01 00:00:00.000 | Fruits & More | Bananas | 21:30:00 10 | 2020-09-02 00:00:00.000 | Fruits & More | Apples | 23:30:00
So, in the example provided, I would want the query to return Line 1 (14:30 is the latest time on 9/1 for Fruits Fruits Fruits), Line 7 (21:30 is the latest time for Fruits & More on 9/1, Line 4 (15:30 is the latest time on 9/1 for Fruits Fruits Fruits), etc.
I think I'm missing a subquery or maybe I need to fiddle with my group by.
One approach uses a correlated subquery:
select a.*
from availability a
where a.arrival_time = (
select max(a1.arrival_time)
from availability a1
where a1.vendor = a.vendor
and a1.date >= convert(date, a.date)
and a1.date < dateadd(day, 1, convert(date, a.date))
);
The conditions on the dates could be expressed more simply as convert(date, a.date) = convert(date, a1.date)
, but the above expressions are likely to be more efficient, especially with an index on (vendor, date, arrival_time)
.
Another typical approach is window functions:
select *
from (
select a.*,
row_number() over(partition by a.vendor, convert(date, a.date) order by a.arrival_time desc) rn
from availability a
) a
where rn = 1