Search code examples
postgresqlhqlgreatest-n-per-group

Query for most recent record per group


I have a table name price:

CREATE TABLE price (
  price_id bigint pk,
  product_id bigint,
  start_date timestamp with timezone
)

with the query:

select price_id, product_id, price, start_date 
from price 
where start_date <= current_timestamp

I get this, sorted by UI:

price_id | product_id  | start_date
5 1 2018-12-31  17:00:00.000000
60  1   2019-07-18 17:00:00.000000
66  1   2019-07-19 17:00:00.000000
6   2   2018-12-31 17:00:00.000000
69  2   2019-07-19 17:00:00.000000
7   3   2018-12-31 17:00:00.000000
8   4   2018-12-31 17:00:00.000000
9   5   2018-12-31 17:00:00.000000
10  6   2018-12-31 17:00:00.000000
11  7   2018-12-31 17:00:00.000000
59  7   2019-07-18 17:00:00.000000
67  8   2019-07-19 17:00:00.000000
71  8   2019-07-22 17:00:00.000000
12  8   2018-12-31 17:00:00.000000
64  8   2019-07-17 17:00:00.000000
13  9   2018-12-31 17:00:00.000000
14  10  2018-12-31 17:00:00.000000
15  11  2018-12-31 17:00:00.000000
16  12  2018-12-31 17:00:00.000000
70  12  2019-07-21 17:00:00.000000
17  13  2018-12-31 17:00:00.000000
18  14  2018-12-31 17:00:00.000000
19  15  2018-12-31 17:00:00.000000
46  16  2019-07-05 17:00:00.000000
54  16  2019-07-17 17:00:00.000000
20  16  2019-06-01 07:46:25.737000
44  16  2019-07-04 17:00:00.000000
53  16  2019-07-15 17:00:00.000000
55  16  2019-07-18 17:00:00.000000
21  17  2018-12-31 17:00:00.000000
22  18  2018-12-31 17:00:00.000000
23  19  2018-12-31 17:00:00.000000
24  20  2018-12-31 17:00:00.000000
25  21  2018-12-31 17:00:00.000000
74  21  2019-07-22 17:00:00.000000
26  22  2018-12-31 17:00:00.000000
27  23  2018-12-31 17:00:00.000000
68  23  2019-07-20 17:00:00.000000
28  24  2018-12-31 17:00:00.000000
29  25  2018-12-31 17:00:00.000000
30  26  2018-12-31 17:00:00.000000
31  27  2018-12-31 17:00:00.000000
32  28  2018-12-31 17:00:00.000000
33  29  2018-12-31 17:00:00.000000
34  30  2018-12-31 17:00:00.000000
35  31  2018-12-31 17:00:00.000000
36  32  2018-12-31 17:00:00.000000
37  33  2018-12-31 17:00:00.000000
63  34  2019-07-17 17:00:00.000000
38  34  2018-12-31 17:00:00.000000
62  34  2019-07-14 17:00:00.000000
39  35  2018-12-31 17:00:00.000000
43  35  2019-06-01 07:46:25.737000
61  35  2019-07-21 17:00:00.000000

As you can see, there are multiple product_ids and for each product_id, there are multiple start_dates.

I need a postgre native query that can get only 1 start_date that is closest current_timestamp for each product_id.

translated to HQL after if possible. If not, postgre query only is fine.


Solution

  • demo:db<>fiddle

    The DISTINCT ON clause gives you the first record of an ordered group. The groups in your case are the productIds which are ordered by the start_date in descending order which brings the most recent one to the top of each group. This is taken by the the DISTINCT ON:

    SELECT DISTINCT ON (product_id)
        *
    FROM
        price
    ORDER BY product_id, start_date DESC