sql postgresql aggregate-functions

Postgres Conditional Sum Function


Given the following table mileage, I need to compute two sums for each vehicle_id: one over all miles entries, and another over miles only where is_processed = false.

mileage

vehicle_id  miles  is_processed  created_at
773         800    f             2023-04-17 13:11:59.100846
767         200    f             2023-04-24 11:29:52.503024
767         100    t             2023-04-11 15:14:33.125348
773         200    t             2023-04-11 15:49:34.313639
767         150    f             2023-04-24 11:43:31.496871

This would be simple enough with one of two methods:

Method 1

select vehicle_id,
  sum(miles) as total,
  (select sum(miles) from mileage t2 where t1.vehicle_id = t2.vehicle_id and is_processed = false) extra
from mileage t1
group by vehicle_id;

Method 2

Fetch all records and compute with a single for loop in my language of choice.
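
For reference, Method 2 could look roughly like the sketch below. It assumes Python as the "language of choice" and that the rows have already been fetched as (vehicle_id, miles, is_processed) tuples; the function name sum_mileage and the hard-coded rows (copied from the sample table) are only for illustration.

```python
from collections import defaultdict

# Sample rows copied from the mileage table: (vehicle_id, miles, is_processed).
# In practice these would come from a database cursor (assumption).
rows = [
    (773, 800, False),
    (767, 200, False),
    (767, 100, True),
    (773, 200, True),
    (767, 150, False),
]


def sum_mileage(rows):
    """Return {vehicle_id: [total, unprocessed]} using a single pass over rows."""
    totals = defaultdict(lambda: [0, 0])
    for vehicle_id, miles, is_processed in rows:
        totals[vehicle_id][0] += miles      # every entry counts toward the total
        if not is_processed:
            totals[vehicle_id][1] += miles  # only unprocessed entries count here
    return dict(totals)


result = sum_mileage(rows)
print(result)  # {773: [1000, 800], 767: [450, 350]}
```

This works, but it means transferring every row to the client just to add them up.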

Desired solution

However, I want to know if there is a way to do all of that within PostgreSQL itself. I already tried window functions and could not figure it out, and grouping would involve a second scan of the same table, which would hurt performance given that I expect to process hundreds of thousands of rows.

Any help is appreciated. Thanks!


Solution

  • Try a "conditional aggregate", which is really just a case expression placed inside the aggregate function.

    SELECT vehicle_id
        ,sum(miles) AS total
        ,sum(case when is_processed = false then miles else 0 end) AS extra
    FROM mileage t1
    GROUP BY vehicle_id;
    

    Keep in mind the case expression evaluates to a value on each row, so the aggregate is still just summing values.
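
To check the query against the sample rows from the question, here is a small self-contained sketch. It is an assumption-laden stand-in, not PostgreSQL: it uses Python's built-in sqlite3 module with an in-memory table, and because SQLite has no boolean type, is_processed is stored as 0/1 and compared with 0 instead of false. The created_at column is left out since the sums do not use it.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mileage (vehicle_id INTEGER, miles INTEGER, is_processed INTEGER)"
)
conn.executemany(
    "INSERT INTO mileage VALUES (?, ?, ?)",
    [
        (773, 800, 0),
        (767, 200, 0),
        (767, 100, 1),
        (773, 200, 1),
        (767, 150, 0),
    ],
)

# The CASE expression yields miles or 0 on each row; sum() then adds those
# per-row values, so both sums come from one GROUP BY over the table.
query = """
SELECT vehicle_id,
       sum(miles) AS total,
       sum(CASE WHEN is_processed = 0 THEN miles ELSE 0 END) AS extra
FROM mileage
GROUP BY vehicle_id
ORDER BY vehicle_id
"""
summary = conn.execute(query).fetchall()
print(summary)  # [(767, 450, 350), (773, 1000, 800)]
```

On PostgreSQL 9.4 or later, the same condition can also be written with the aggregate FILTER clause, e.g. sum(miles) FILTER (WHERE NOT is_processed). Note that FILTER returns NULL rather than 0 for a group with no matching rows.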