Search code examples
postgresqlaggregate-functionscalculated-columns

Postgres Functions Across Columns


I've got unstacked data in a Postgres database, trying to perform calculations across columns.

Here's my data:

group obs1 obs2 obs3 obs4 obs5
1 74.030 74.002 74.019 73.992 74.008
2 73.995 73.992 74.001 74.011 74.004
3 73.988 74.024 74.021 74.005 74.002
4 74.002 73.996 73.993 74.015 74.009

Trying to get the average and range (Max()-Min()) across columns. Below is the desired outcome:

| group | obs1  |  obs2 |  obs3 | obs4  | obs5|grp_avg| grp_range|
|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:------:|
|1      |74.030 | 74.002| 74.019| 73.992| 74.008| 74.010| 0.038 |
|2      |73.995 | 73.992| 74.001| 74.011| 74.004| 74.001| 0.019 |
|3      |73.988 | 74.024| 74.021| 74.005| 74.002| 74.008| 0.036 |
|4      |74.002 | 73.996| 73.993| 74.015| 74.009| 74.003| 0.022 |

I looked at this question and don't understand, but don't believe it solves this. Based on that question it pointed to Postgres documentation but it seems way more complicated than I am looking to do.

In a Pandas DataFrame (how I'm doing this now) there is the option of changing the axis parameter, causing functions to work on columns rather than rows. I'm hopeful Postgres has some kind of similar functionality.


Solution

  • Basically, the quoted accepted solution answers your question. To understand it better, it can be represented with a common table expression. This has the advantage that the query is created step by step and you can check the results at each stage (see db<>fiddle).

    with one_column as (
        select 
            grp, 
            unnest(array[obs1, obs2, obs3, obs4, obs5]) as elem
        from my_table
    ),
    aggs as (
        select 
            grp, 
            avg(elem)::numeric(20, 3), 
            max(elem)- min(elem) as rng
        from one_column
        group by grp
    )
    select 
        grp, 
        obs1, obs2, obs3, obs4, obs5,
        avg, 
        rng
    from aggs
    join my_table using(grp)
    order by grp;
    

    Alternatively, you can just calculate the aggregates using the functions greatest() and least().

    select 
        grp, obs1, obs2, obs3, obs4, obs5,
        ((obs1+ obs2+ obs3+ obs4+ obs5)/ 5)::numeric(20, 3) as avg,
        greatest(obs1, obs2, obs3, obs4, obs5)- least(obs1, obs2, obs3, obs4, obs5) as rng
    from my_table;
    

    Db<>fiddle.