Search code examples
sqldatabasepostgresqlgreatest-n-per-group

Select rows based on last date


I have a table named Course in a Postgres database:

sample data for table Course

How can I select rows which have course name with latest date? I mean if I have two same course names for one ID, I should only show the latest one as the below result.

Simply, I want only to show the latest row per ("ID", "Course Name").

desired result

And what if I have two date columns in table Course, which are StartDate & EndDate and I want to show the same based on EndDate only?


Solution

  • In PostgreSQL, to get unique rows for a defined set of columns, the preferable technique is generally DISTINCT ON:

    SELECT DISTINCT ON ("ID") *
    FROM   "Course"
    ORDER  BY "ID", "Course Date" DESC NULLS LAST, "Course Name";
    

    Assuming you actually use those unfortunate upper case identifiers with spaces.

    You get exactly one row per ID this way - the one with the latest known "Course Date" and the first "Course Name" (according to sort order) in case of ties on the date.

    You can drop NULLS LAST if your column is defined NOT NULL.

    To get unique rows per ("ID", "Course Name"):

    SELECT DISTINCT ON ("ID", "Course Name") *
    FROM   "Course"
    ORDER  BY "ID", "Course Name", "Course Date" DESC NULLS LAST;
    

    There are faster query techniques for many rows per group. Further reading: