I am working with Postgres, and I have seen different behaviours for the same condition in a simple query vs a query with LEFT JOIN in it.
In Postgres, when I run this query:
SELECT *
FROM feed_common
LEFT JOIN user_message_query ON feed_common.id = user_message_query.feed_common_id
WHERE user_message_query.user_message_outbox_id IS NULL
AND feed_common.verification_status != 'spam'
I do not see the entries where feed_common.verification_status is NULL. To make it work, I had to run this query:
SELECT *
FROM feed_common
LEFT JOIN user_message_query ON feed_common.id = user_message_query.feed_common_id
WHERE user_message_query.user_message_outbox_id IS NULL
AND ( feed_common.verification_status != 'spam'
OR feed_common.verification_status IS NULL)
I understand that NULL values behave differently, and comparing them returns a NULL instead of a boolean. But, if that is the case, then why, in a simple query, am I able to see all the entries with verification_status = NULL as well?
In the same database, if I perform this query where I am filtering the rows with the same condition (and not specifically asking to include NULL values), I am getting the entries with verification_status = NULL as well:
SELECT *
FROM feed_common
WHERE feed_common.verification_status != 'spam'
Is this because of the "LEFT JOIN"? Does Postgres handle a little more complex queries like the ones with JOIN differently compared to simple queries?
EDIT: Apologies for the misunderstanding from my end. The queries were functioning as expected. The rows with the status = NULL were in fact not getting returned unless specified. I misread the returned rows. Again apologies for the oversight and thanks to everyone who participated in the discussion. Thanks to @laurenz for pointing out my mistake.
You claim that
if I perform this query where I am filtering the rows with the same condition (and not specifically asking to include NULL values), I am getting the entries with verification_status = NULL as well:
SELECT * FROM feed_common WHERE feed_common.verification_status != 'spam'
That cannot be true. This query would never return a row with verification_status
being NULL. It would return rows where the value is an empty string, which is something entirely different.
So your problem is unrelated to the left join, which is working properly. The problem is your wrong assumption about NULL matching =
and <>
comparisons.
A simpler way to write
feed_common.verification_status <> 'spam'
OR feed_common.verification_status IS NULL
would be
feed_common.verification_status IS DISTINCT FROM 'spam'
But both versions will make index searches difficult. A better design might be to make verification_status
be NOT NULL
and to introduce an extra status for what is now represented by NULL.