Search code examples
sqlpostgresqlcountmultiple-columnspostgresql-12

SQL / Postgresql count multiple columns with conditions


I have a simple table of the form:

id gender a_feature (bool) b_feature (bool) ... xyz_feature (bool)

and I want to sum over all feature columns dependent on gender.

metric male female
a_feature 345 3423
b_feature 65 143
... ... ...
xyz_feature 133 5536

Is there a simple way to do this, e.g. using the information_schema.

I found only the solution below, but this is very ugly:

select
       'a_feature' as feature_name,
       count(case a_feature and gender = 'male') as male,
       count(case a_feature and gender = 'female') as female
from table
union
select
       b_feature as feature_name,
       count(case b_feature and gender = 'male') as male,
       count(case b_feature and gender = 'female') as female
from table
.
.
.
select
       xyz_feature as feature_name,
       count(case xyz_feature and gender = 'male') as male,
       count(case xyz_feature and gender = 'female') as female
from table

Solution

  • You can unpivot and aggregate. One method is:

    select name,
           sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
           sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
    from ((select 'a_feature' as name, a_feature as feature, gender
           from t
          ) union all
          (select 'b_feature' as name, b_feature, gender
           from t
          ) union all
          . . .
         ) f
    group by name;
    

    In Postgres, you would unpivot using a lateral join:

    select name,
           sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
           sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
    from t cross join lateral
         (values ('a_feature', a_feature),
                 ('b_feature', b_feature),
                 . . .
         ) v(name, feature)
    group by name;
    

    You can generate the list for values() using information_schema.columns if you are reluctant to type it all in.

    EDIT:

    You can construct the values clause using something like this:

    select string_agg('(''' || column_name || ''', column_name)', ', ') 
    from information_schema.columns
    where table_name = ?