Search code examples
sqlpostgresqlrelational-databaserdbms

Conditionally COUNT the Value of a Column in SQL and Account for NULL Value


I have a SQL query where I try to perform some conditions inside a COUNT(). I am using PostgreSQL. Here is my original query:

SELECT s.schedule_ID as "schedule_ID", schedule_start, schedule_end, spots, 
COUNT(CASE a.schedule_ID WHEN s.schedule_ID THEN TRUE ELSE NULL END) as "takenSpots" 
FROM schedules s, appointments a 
GROUP BY s.schedule_ID
ORDER BY s.schedule_ID;

Basically, this counts the schedule_ID from appointments only when it matches the schedule_ID in schedules table. The issue is that this query does not account for the case when a.schedule_ID is null from appointments table. When it is null, it returns no records.I want to account for null values and if it is null, then return 0.

After some research I found that something like this must work:

SELECT s.schedule_ID as "schedule_ID", schedule_start, schedule_end, spots, 
COUNT(CASE a.schedule_ID WHEN s.schedule_ID THEN TRUE WHEN NULL THEN '0' END) as "takenSpots" 
FROM schedules s, appointments a 
GROUP BY s.schedule_ID
ORDER BY s.schedule_ID;

But no luck and still no record is returned here even when I check for NULL value. Can someone please help me out?

I appreciate your help!


Solution

  • The reason your CASE exprssion is not working is because the "short form" you used can only test for equality which is never true for NULL values.

    So you would need to use:

    CASE WHEN a.schedule_ID = coalesce(s.schedule_ID, a.schedule_ID) then TRUE ELSE NULL END
    

    However you are not properly joining the two tables. from a,b is an implicit cross join creating a cartesian product between the two tables.

    It seems you are looking for an outer join, then count the number of matches:

    SELECT s.schedule_ID as "schedule_ID", schedule_start, schedule_end, spots, 
           count(a.schedule_id) "takenSpots" 
    FROM schedules s
      LEFT JOIN appointments a on s.schedule_id = a.schedule_id
    GROUP BY s.schedule_ID
    ORDER BY s.schedule_ID;
    

    The outer join will return all rows from the schedules table and only those that have match from the appointments table. For those rows in schedules that have no appointment, a.schedule_id will be null, and count() ignores those.


    Assuming all columns in the SELECT list (except a.schedule_id) are from the schedules table, there is slightly more efficient way of doing this: aggregate the appointments first, then do the outer join:

    SELECT s.schedule_id as "schedule_ID", schedule_start, schedule_end, spots, 
           coalesce(a.num_appointments,0) as "takenSpots" 
    FROM schedules s
      LEFT JOIN ( 
        select ap.schedule_id, count(*) as num_appointments
        from appointments  app
        group by ap.schedule_id
      ) a on s.schedule_id = a.schedule_id
    ORDER BY s.schedule_id;