Search code examples
sqlpostgresqlgreatest-n-per-group

Gather the last entry for a specific metric from multiple fields


Context

I have this table:

Time Metric A Metric B Value
A A A 1
A A B 2
A A C 1
B A A 1
B A B 2
B A C 1

I want to get the last and summarize the table by Metric A, but sometimes Time goes bonkers and GROUP BY Time is not an option.

What I want is to get the last value of Metric B for each Metric A. Any tips on how to make sure I get the last inserted value for Metric B for Metric A?

Desired output

Time Metric A Metric B Value
A A A 1
A A B 2
A A C 1

Solution

  • You could use rank() or dense_rank() to achieve what you're looking for.

    create table table1 (
      time timestamp, 
      metric_a varchar(1), 
      metric_b varchar(1), 
      value integer
      );
      
    insert into table1 values 
    (current_timestamp, 'A','A',1),
    (current_timestamp, 'A','B',2),
    (current_timestamp, 'A','C',1),
    (current_timestamp - interval '10 minutes', 'A','A',1), 
    (current_timestamp - interval '10 minutes', 'A','B',2), 
    (current_timestamp - interval '10 minutes', 'A','C',1), 
    (current_timestamp, 'B','A',2),
    (current_timestamp, 'B','B',3),
    (current_timestamp, 'B','C',4),
    (current_timestamp - interval '10 minutes', 'B','A',2), 
    (current_timestamp - interval '10 minutes', 'B','B',3), 
    (current_timestamp - interval '10 minutes', 'B','C',4);
    

    select time, metric_a, metric_b, value 
    from (
      select *, 
       dense_rank() over (partition by metric_a, metric_b order by time desc) as rnk
      from table1
      )z
    where rnk = 1;
    
    time metric_a metric_b value
    2023-08-14T15:13:49.623Z A A 1
    2023-08-14T15:13:49.623Z A B 2
    2023-08-14T15:13:49.623Z A C 1
    2023-08-14T15:13:49.623Z B A 2
    2023-08-14T15:13:49.623Z B B 3
    2023-08-14T15:13:49.623Z B C 4

    View on DB Fiddle