Search code examples
sqloracle-databasegreatest-n-per-group

sql select everything with maximum date (that that is smaller than a specific date) without subqueries


I would like to write a sql query where I choose all rows grouped by id where the column date is the latest date for this id but still smaller than for example 16-JUL-2021. I would like to do this without using subqueries (in oracle), is that possible? I tried the below but it doesn't work.

SELECT *, max(date) 
WHERE date < '16-JUL-2021'
OVER(PARTITION BY id ORDER BY date DESC) as sth
FROM table

Solution

  • You can find the maximum date without sub-queries.

    SELECT t.*,
           max("DATE") OVER(PARTITION BY id ORDER BY "DATE" DESC) as max_date
    FROM   "TABLE" t
    WHERE  "DATE" < DATE '2021-07-16'
    

    You need a sub-query to filter to only show the row(s) with the maximum date:

    SELECT *
    FROM   (
      SELECT t.*,
             max("DATE") OVER(PARTITION BY id ORDER BY "DATE" DESC) as max_date
      FROM   "TABLE" t
      WHERE  "DATE" < DATE '2021-07-16'
    )
    WHERE  "DATE" = max_date;
    

    However, you are still only querying the table once using this technique even though it uses a sub-query.

    Note DATE and TABLE are reserved words and cannot be used as unquoted identifiers; it would be better practice to use different names for those identifiers.

    You could, equivalently use the RANK or DENSE_RANK analytic functions instead of MAX; ROW_NUMBER, however, does not give the same output as it will only return a single row and will not return all tied rows.

    SELECT *
    FROM   (
      SELECT t.*,
             RANK() OVER(PARTITION BY id ORDER BY "DATE" DESC) as rnk
      FROM   "TABLE" t
      WHERE  "DATE" < DATE '2021-07-16'
    )
    WHERE  rnk = 1;
    

    But you still need a sub-query to filter the rows.


    If you want to not use a sub-query then you can use:

    SELECT id,
           MAX("DATE") AS "DATE",
           MAX(col1) KEEP (DENSE_RANK LAST ORDER BY "DATE", ROWNUM) AS col1,
           MAX(col2) KEEP (DENSE_RANK LAST ORDER BY "DATE", ROWNUM) AS col2,
           MAX(col3) KEEP (DENSE_RANK LAST ORDER BY "DATE", ROWNUM) AS col3
    FROM   "TABLE"
    GROUP BY id
    

    However, that is not quite the same as it will only get a single row per id and will not return multiple rows tied for the greatest date per id.