Search code examples
sqlaverageaggregate-functionsunpivot

SQL GROUP BY where either column has same value


I have the following table

User A | User B | Value
-------+--------+------
   1   |    2   |  60
   3   |    1   |  10
   4   |    5   |  50
   3   |    5   |  50  
   5   |    1   |  80
   2   |    3   |  10

I want group together records where either user a = x or user b = x, in order to find averages.

e.g. User 1 appears in the table 3 times, once as 'User A' and twice as 'User B'. So I would want to carry out my AVG() function using those three rows.

I need the highest and lowest average values. Such a query would break down the above table into the following groups:

User | Avg Value
-----+-----
  1  | 50
  2  | 35
  3  | 23.33
  4  | 50
  5  | 60

and then return

Highest Avg | Lowest Avg
------------+-----------
     60     |   23.33

I know that GROUP BY collects together records where a column has the same value. I want to collect together records where either one of two columns has the same value. I have searched through many solutions but can't seem to find one that meets my problem.


Solution

  • A portable option uses union all:

    select usr, avg(value) avg_value
    from (
        select usera usr, value from mytable
        union all select userb, value from mytable
    ) t
    group by usr
    

    This gives you the first resultset. Then, you can add another level of aggregataion to get the maximum and minimum average:

    select min(avg_value) min_avg_value, max(avg_value) max_avg_value
    from (
        select usr, avg(value) avg_value
        from (
            select usera usr, value from mytable
            union all select userb, value from mytable
        ) t
        group by usr
    ) t
    

    In databases that support lateral joins and values(), this is most convinently (and efficiently) expressed as follows:

    select min(avg_value) min_avg_value, max(avg_value) max_avg_value
    from (
        select usr, avg(value) avg_value
        from mytable t
        cross join lateral (values (usera, value), (userb, value)) as x(usr, value)
        group by usr
    ) t
    

    This would work in Postgres for example. In SQL Server, you would just replace cross join lateral with cross apply.