Search code examples
sqlpostgresqldatetimegreatest-n-per-grouptimescaledb

Postgres: Get value of a column corresponding to max of other column in a group


I am trying to write a postgres query which returns max, min, median, first and last values in a group along with the timestamp column for each aggregate value

Table

Id Timestamp_utc                  Value
1  2020-11-05 15:36:15.768388     10
1  2020-11-05 15:40:15.768388     20
1  2020-11-05 15:44:15.768388     30
1  2020-11-05 15:45:15.768388.    5
1  2020-11-05 15:59:15.768388     25
1  2020-11-05 16:59:15.768388     25

Expected Result

Id Median Median_Timestamp Min Min_Timestamp               Max Max_TimeStamp
1  17.5.  15:44:15.768388  5   2020-11-05 15:45:15.768388  30   2020-11-05 15:44:15.768388

I have this query which groups data doesn't include the timestamp

SELECT Id, time_bucket('60', timestamp_utc) AS bucket,
percentile_cont(0.5) within group (order by value) median_value,
min(value) min_value, 
max(value) max_value 
FROM rs.MyTable 
WHERE id IN ( 1111,123)
AND timestamp_utc Between '2020-11-05 10:00:15.748643' and '2020-11-05 16:35:48.750313'
GROUP BY id, bucket 
ORDER BY id, bucket

Is there a way I could get timestamp column as well for the aggregated value like timestamp_utc col data when the value is maximum?


Solution

  • One option uses window functions in a subquery to rank the timestamps by increasing and descending value, then conditional aggregation in the outer query to bring the relevant values

    select id, bucket,
        percentile_cont(0.5) within group (order by value) median_value,
        min(value) min_value, 
        max(timestamp_utc) filter(where rn_asc = 1) min_timestamp,
        max(value) max_value,
        max(timestamp_utc) filter(where rn_desc = 1) max_timestamp
    from (
        select t.*, 
            row_number() over(partition by id, bucket order by value) rn_asc,
            row_number() over(partition by id, bucket order by value desc) rn_desc
        from (
            select t.*, time_bucket('60', timestamp_utc) as bucket 
            from rs.mytable t
            where 
                id in (1111,123)
                and timestamp_utc between '2020-11-05 10:00:15.748643'::timestamp 
                                      and '2020-11-05 16:35:48.750313'::timestamp
        ) t
    ) t
    group by id, bucket 
    order by id, bucket
    

    Note that we need to compute the bucket first, and put it in the partition of the window function.