Search code examples
sqlpostgresql

In a join, fields with null values are not getting shown but in normal query they are


I am working with Postgres, and I have seen different behaviours for the same condition in a simple query vs a query with LEFT JOIN in it.

In Postgres, when I run this query:

SELECT *
  FROM feed_common
  LEFT JOIN user_message_query ON feed_common.id = user_message_query.feed_common_id
  WHERE user_message_query.user_message_outbox_id IS NULL
    AND feed_common.verification_status != 'spam'

I do not see the entries where feed_common.verification_status is NULL. To make it work, I had to run this query:

SELECT * 
  FROM feed_common 
  LEFT JOIN user_message_query ON feed_common.id = user_message_query.feed_common_id 
  WHERE user_message_query.user_message_outbox_id IS NULL 
    AND (   feed_common.verification_status != 'spam'
         OR feed_common.verification_status IS NULL)

I understand that NULL values behave differently, and comparing them returns a NULL instead of a boolean. But, if that is the case, then why, in a simple query, am I able to see all the entries with verification_status = NULL as well?

In the same database, if I perform this query where I am filtering the rows with the same condition (and not specifically asking to include NULL values), I am getting the entries with verification_status = NULL as well:

SELECT *
  FROM feed_common
  WHERE feed_common.verification_status != 'spam'

Is this because of the "LEFT JOIN"? Does Postgres handle a little more complex queries like the ones with JOIN differently compared to simple queries?

EDIT: Apologies for the misunderstanding from my end. The queries were functioning as expected. The rows with the status = NULL were in fact not getting returned unless specified. I misread the returned rows. Again apologies for the oversight and thanks to everyone who participated in the discussion. Thanks to @laurenz for pointing out my mistake.


Solution

  • You claim that

    if I perform this query where I am filtering the rows with the same condition (and not specifically asking to include NULL values), I am getting the entries with verification_status = NULL as well:

    SELECT *
      FROM feed_common
      WHERE feed_common.verification_status != 'spam'
    

    That cannot be true. This query would never return a row with verification_status being NULL. It would return rows where the value is an empty string, which is something entirely different.

    So your problem is unrelated to the left join, which is working properly. The problem is your wrong assumption about NULL matching = and <> comparisons.

    A simpler way to write

       feed_common.verification_status <> 'spam'
    OR feed_common.verification_status IS NULL
    

    would be

    feed_common.verification_status IS DISTINCT FROM 'spam'
    

    But both versions will make index searches difficult. A better design might be to make verification_status be NOT NULL and to introduce an extra status for what is now represented by NULL.