Search code examples
sqlsql-serverdatetimegreatest-n-per-groupwindow-functions

Using MAX to find the lastest time given other attribtues


Here's an example of the data set layout: https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=b537554bb5e9dbde4f6c662fc302db5f

I want to write a select query to get the last product that arrived for each Vendor, by date.

Sample data:

Line | date                    | Vendor             | Product_Name | Arrival_Time
---: | :---------------------- | :----------------- | :----------- | :-----------
   1 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Bananas      | 14:30:00    
   2 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Apples       | 13:30:00    
   3 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Oranges      | 08:30:00    
   4 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Apples       | 15:30:00    
   5 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Oranges      | 12:30:00    
   6 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Bananas      | 04:30:00    
   7 | 2020-09-01 00:00:00.000 | Fruits & More      | Bananas      | 21:30:00    
   8 | 2020-09-01 00:00:00.000 | Fruits & More      | Apples       | 00:30:00    
   9 | 2020-09-01 00:00:00.000 | Fruits & More      | Oranges      | 05:30:00    
  10 | 2020-09-02 00:00:00.000 | Fruits & More      | Apples       | 23:30:00    
  11 | 2020-09-02 00:00:00.000 | Fruits & More      | Oranges      | 15:30:00    
  12 | 2020-09-02 00:00:00.000 | Fruits & More      | Bananas      | 01:30:00    

Desired results:

Line | date                    | Vendor             | Product_Name | Arrival_Time
---: | :---------------------- | :----------------- | :----------- | :-----------
   1 | 2020-09-01 00:00:00.000 | Fruits Fruit Fuits | Bananas      | 14:30:00    
   4 | 2020-09-02 00:00:00.000 | Fruits Fruit Fuits | Apples       | 15:30:00    
   7 | 2020-09-01 00:00:00.000 | Fruits & More      | Bananas      | 21:30:00    
  10 | 2020-09-02 00:00:00.000 | Fruits & More      | Apples       | 23:30:00   

So, in the example provided, I would want the query to return Line 1 (14:30 is the latest time on 9/1 for Fruits Fruits Fruits), Line 7 (21:30 is the latest time for Fruits & More on 9/1, Line 4 (15:30 is the latest time on 9/1 for Fruits Fruits Fruits), etc.

I think I'm missing a subquery or maybe I need to fiddle with my group by.


Solution

  • One approach uses a correlated subquery:

    select a.*
    from availability a
    where a.arrival_time = (
        select max(a1.arrival_time)
        from availability a1
        where a1.vendor = a.vendor 
            and a1.date >= convert(date, a.date) 
            and a1.date < dateadd(day, 1, convert(date, a.date))
    );
    

    The conditions on the dates could be expressed more simply as convert(date, a.date) = convert(date, a1.date), but the above expressions are likely to be more efficient, especially with an index on (vendor, date, arrival_time).

    Another typical approach is window functions:

    select *
    from (
        select a.*, 
            row_number() over(partition by a.vendor, convert(date, a.date) order by a.arrival_time desc) rn
        from availability a
    ) a
    where rn = 1