Search code examples
sqlpostgresqlleft-join

Join condition of a LEFT JOIN seems to be ignored for the left table?


My query seems to ignore the ON column = constant predicate of a LEFT JOIN for the left table. It does enforce ON column = constant for the right table, though.

If I move the left table's ON column = constant to the WHERE clause instead, the query works as intended.

Why does it matter, for the left table, whether I place the column = constant in the WHERE part of the query, or in the ON clause of the LEFT JOIN?

(What happens, is that the left table ON column = constant condition gets pushed up to a "JOIN Filter" step, where it is seemingly being ignored.)

Details:

EXPLAIN ANALYZE
select * from DW1_PAGE_PATHS t left join DW1_PAGES g
   on t.TENANT = g.TENANT
  and t.PAGE_ID = g.GUID
  and g.GUID = 'abcdefg'  -- works
  and t.CANONICAL = 'C'  -- "ignored", unless moved to `where` clause
where t.TENANT = '72'
  and PARENT_FOLDER like '/%';

Here (below) is the exec plan. Note that t.CANONICAL = 'C' has been pushed up to the "JOIN Filter" step, whereas the g.GUID = 'abcdefg' filter happens directly when the right table is being scanned.

 Nested Loop Left Join  (cost=... actual time=...)
   Join Filter: (((t.canonical)::text = 'C'::text)
             AND ((t.tenant)::text = (g.tenant)::text)
             AND ((t.page_id)::text = (g.guid)::text))
   ->  Seq Scan on dw1_page_paths t
         Filter: (((parent_folder)::text ~~ '/%'::text)
              AND ((tenant)::text = '72'::text))
   ->  Seq Scan on dw1_pages g
         Filter: (((tenant)::text = '72'::text)
              AND ((guid)::text = 'abcdefg'::text))

(Another question: Why won't the "Join Filter" with t.canonical = 'C' filter out rows for which canonical is not 'C'? It does not.)

PostgreSQL version: psql (9.1.6, server 9.1.1)

Here is a link to a similar query, but the answers don't explain why it works if you move the left table ON column = constant to a where clause instead:


Solution

  • The ON clause of a LEFT [OUTER] JOIN only regulates whether 0 or more row(s) from the right table are joined. (Possibly multiplying output rows.)

    It does not filter rows from the left table. If you want that, the expression has to go into a WHERE clause (as you found out already) or the ON clause of an [INNER] JOIN.

    Of course, a WHERE condition makes nonsense of a LEFT JOIN on the same table, as that's now forced to act as a plain JOIN.

    That's all by design. Standard SQL.