I've got unstacked data in a Postgres database, trying to perform calculations across columns.
Here's my data:
group | obs1 | obs2 | obs3 | obs4 | obs5 |
---|---|---|---|---|---|
1 | 74.030 | 74.002 | 74.019 | 73.992 | 74.008 |
2 | 73.995 | 73.992 | 74.001 | 74.011 | 74.004 |
3 | 73.988 | 74.024 | 74.021 | 74.005 | 74.002 |
4 | 74.002 | 73.996 | 73.993 | 74.015 | 74.009 |
Trying to get the average and range (Max()-Min()) across columns. Below is the desired outcome:
| group | obs1 | obs2 | obs3 | obs4 | obs5|grp_avg| grp_range|
|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:------:|
|1 |74.030 | 74.002| 74.019| 73.992| 74.008| 74.010| 0.038 |
|2 |73.995 | 73.992| 74.001| 74.011| 74.004| 74.001| 0.019 |
|3 |73.988 | 74.024| 74.021| 74.005| 74.002| 74.008| 0.036 |
|4 |74.002 | 73.996| 73.993| 74.015| 74.009| 74.003| 0.022 |
I looked at this question and don't understand, but don't believe it solves this. Based on that question it pointed to Postgres documentation but it seems way more complicated than I am looking to do.
In a Pandas DataFrame
(how I'm doing this now) there is the option of changing the axis
parameter, causing functions to work on columns rather than rows. I'm hopeful Postgres has some kind of similar functionality.
Basically, the quoted accepted solution answers your question. To understand it better, it can be represented with a common table expression. This has the advantage that the query is created step by step and you can check the results at each stage (see db<>fiddle).
with one_column as (
select
grp,
unnest(array[obs1, obs2, obs3, obs4, obs5]) as elem
from my_table
),
aggs as (
select
grp,
avg(elem)::numeric(20, 3),
max(elem)- min(elem) as rng
from one_column
group by grp
)
select
grp,
obs1, obs2, obs3, obs4, obs5,
avg,
rng
from aggs
join my_table using(grp)
order by grp;
Alternatively, you can just calculate the aggregates using the functions greatest() and least().
select
grp, obs1, obs2, obs3, obs4, obs5,
((obs1+ obs2+ obs3+ obs4+ obs5)/ 5)::numeric(20, 3) as avg,
greatest(obs1, obs2, obs3, obs4, obs5)- least(obs1, obs2, obs3, obs4, obs5) as rng
from my_table;