Search code examples
sqlpostgresqlquery-optimizationpostgresql-performancelateral-join

Why is a LATERAL JOIN faster than a correlated subquery in Postgres?


I have rewritten the below query from using a correlated subquery to using a LATERAL JOIN. It is now faster, and I would like to understand why?

This SO answer says that there is no golden rule and that it really depends on the situation. However, I was wondering if there is some sort of intuition to this?

In case it is relevant, this query is used as a VIEW.

The old query with a correlated subquery (well technically 2):

SELECT
  ts.id,
  ts.val1,
  tv.val2
FROM table_static AS ts
JOIN table_version AS tv ON ts.id = tv.id
  AND tv.effective_time = (
    SELECT MAX(tv1.effective_time)
    FROM table_version AS tv1
    WHERE
      tv1.id = ts.id
      AND
      tv1.effective_time <= CLOCK_TIMESTAMP()
  )
  AND tv.create_time = (
    SELECT MAX(tv2.create_time)
    FROM table_version AS tv2
    WHERE
      tv2.id = tv.id
      AND
      tv2.effective_time = tv.effective_time
      AND
      tv2.create_time <= CLOCK_TIMESTAMP()
  )
JOIN table_status AS t_status ON tv.status_id = t_status.id
WHERE t_status.status != 'deleted'
LIMIT 1000;

Query plan (correlated subquery):

Limit  (cost=4.96..13876200.85 rows=141 width=64) (actual time=0.078..10.788 rows=1000 loops=1)
  ->  Nested Loop  (cost=4.96..13876200.85 rows=141 width=64) (actual time=0.077..10.641 rows=1000 loops=1)
        Join Filter: (tv.status_id = t_status.id)
        ->  Nested Loop  (cost=4.96..13876190.47 rows=176 width=64) (actual time=0.065..10.169 rows=1000 loops=1)
              ->  Seq Scan on table_static ts  (cost=0.00..17353.01 rows=1000001 width=32) (actual time=0.010..0.176 rows=1000 loops=1)
              ->  Index Scan using table_version_pkey on table_version tv  (cost=4.96..13.85 rows=1 width=40) (actual time=0.005..0.006 rows=1 loops=1000)
                    Index Cond: ((id = ts.id) AND (effective_time = (SubPlan 2)))
                    Filter: (create_time = (SubPlan 4))
                    SubPlan 4
                      ->  Result  (cost=8.46..8.47 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1000)
                            InitPlan 3 (returns $4)
                              ->  Limit  (cost=0.43..8.46 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1000)
                                    ->  Index Only Scan Backward using table_version_pkey on table_version tv2  (cost=0.43..8.46 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1000)
                                          Index Cond: ((id = tv.id) AND (effective_time = tv.effective_time) AND (create_time IS NOT NULL))
                                          Filter: (create_time <= clock_timestamp())
                                          Heap Fetches: 0
                    SubPlan 2
                      ->  Result  (cost=4.52..4.53 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1000)
                            InitPlan 1 (returns $1)
                              ->  Limit  (cost=0.43..4.52 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=1000)
                                    ->  Index Only Scan Backward using table_version_pkey on table_version tv1  (cost=0.43..8.61 rows=2 width=8) (actual time=0.002..0.002 rows=1 loops=1000)
                                          Index Cond: ((id = ts.id) AND (effective_time IS NOT NULL))
                                          Filter: (effective_time <= clock_timestamp())
                                          Heap Fetches: 0
        ->  Materialize  (cost=0.00..1.08 rows=4 width=8) (actual time=0.000..0.000 rows=1 loops=1000)
              ->  Seq Scan on table_status t_status  (cost=0.00..1.06 rows=4 width=8) (actual time=0.006..0.006 rows=1 loops=1)
                    Filter: (status <> 'deleted'::text)
Planning Time: 0.827 ms
Execution Time: 10.936 ms

The new query with a LATERAL JOIN:

SELECT
  ts.id,
  ts.val1,
  tv.val2
FROM table_static AS ts
JOIN LATERAL (
  SELECT *
  FROM table_version AS tv
  WHERE
    ts.id = tv.id
    AND
    tv.effective_time <= CLOCK_TIMESTAMP()
    AND
    tv.create_time <= CLOCK_TIMESTAMP()
  ORDER BY
    tv.effective_time DESC,
    tv.create_time DESC
  LIMIT 1
) AS tv ON TRUE
JOIN table_status AS t_status ON tv.status_id = t_status.id
WHERE t_status.status != 'deleted'
LIMIT 1000;

Query plan (LATERAL JOIN):

Limit  (cost=0.43..40694.36 rows=1000 width=64) (actual time=0.218..4.431 rows=1000 loops=1)
  ->  Nested Loop  (cost=0.43..32555183.83 rows=800001 width=64) (actual time=0.217..4.280 rows=1000 loops=1)
        Join Filter: (tv.status_id = t_status.id)
        ->  Nested Loop  (cost=0.43..32502382.70 rows=1000001 width=64) (actual time=0.189..3.815 rows=1000 loops=1)
              ->  Seq Scan on table_static ts  (cost=0.00..17353.01 rows=1000001 width=32) (actual time=0.059..0.297 rows=1000 loops=1)
              ->  Limit  (cost=0.43..32.46 rows=1 width=48) (actual time=0.003..0.003 rows=1 loops=1000)
                    ->  Index Scan Backward using table_version_pkey on table_version tv  (cost=0.43..32.46 rows=1 width=48) (actual time=0.003..0.003 rows=1 loops=1000)
                          Index Cond: (id = ts.id)
                          Filter: ((effective_time <= clock_timestamp()) AND (create_time <= clock_timestamp()))
        ->  Materialize  (cost=0.00..1.08 rows=4 width=8) (actual time=0.000..0.000 rows=1 loops=1000)
              ->  Seq Scan on table_status t_status  (cost=0.00..1.06 rows=4 width=8) (actual time=0.021..0.021 rows=1 loops=1)
                    Filter: (status <> 'deleted'::text)
Planning Time: 1.315 ms
Execution Time: 4.746 ms

For both cases the following indices (primary keys) exist:

ALTER TABLE ONLY table_static
    ADD CONSTRAINT table_static_pkey PRIMARY KEY (id);
ALTER TABLE ONLY table_version
    ADD CONSTRAINT table_version_pkey PRIMARY KEY (id, effective_time, create_time);
ALTER TABLE ONLY table_status
    ADD CONSTRAINT table_status_pkey PRIMARY KEY (id);

Maybe the answer is simply because there is one less "subquery"? To my understanding, the indices can be used by both queries.

If there are any other ways I might optimize this query, I would love to hear them.


Solution

  • For simple cases, a correlated subquery is typically slightly faster than a LATERAL subquery. LATERAL subqueries are just more versatile.

    JOIN LATERAL ... ON true is most probably not what you want. It's not equivalent to a correlated subquery in any case - which is LEFT JOIN LATERAL ... ON true. Your version is a more verbose equivalent of CROSS JOIN LATERAL, which eliminates result rows where the right side produces no row. Your two queries are not logically equivalent. Retest with equivalent queries. See:

    And why clock_timestamp()? Most probably wrong. We'd need to see the exact table definition and your rationale for that. clock_timestamp() changes during statement execution, and can make queries much more expensive - besides being wrong most of the time. See:

    Finally, your "correlated" version really uses two distinct correlated subqueries. The second one doubles the work without need, making it more expensive. Use a single correlated subquery instead.

    Apart from that, your index table_version_pkey is pretty perfect for the given query, and both LATERAL and correlated will be very fast.

    Assuming there is a proper 1:1 relationship between table_static and table_status, and a 1:n relationship between table_static and table_status. Compare these:

    -- correlated
    SELECT ts.id, ts.val1
        , (SELECT tv.val2
           FROM   table_version tv
           WHERE  ts.id = tv.id
           AND    tv.effective_time <= now()
           AND    tv.create_time    <= now()
           ORDER  BY tv.effective_time DESC, tv.create_time DESC
           LIMIT  1
          ) AS val2
    FROM   table_static ts
    JOIN   table_status t_status ON tv.status_id = t_status.id
    WHERE  t_status.status <> 'deleted'
    LIMIT  1000;
    
    -- lateral
    SELECT ts.id, ts.val1, tv.val2
    FROM   table_static ts
    JOIN   table_status t_status ON tv.status_id = t_status.id
    LEFT   JOIN LATERAL (
       SELECT tv.val2
       FROM   table_version tv
       WHERE  ts.id = tv.id
       AND    tv.effective_time <= now()
       AND    tv.create_time    <= now()
       ORDER  BY tv.effective_time DESC, tv.create_time DESC
       LIMIT  1
       ) tv ON true
    WHERE  t_status.status <> 'deleted'
    LIMIT  1000;
    

    Both should be very fast, a bit faster than your previous best test candidate, the correlated version the slightly faster one.


    One more issue: LIMIT 1000. Actually, two issues with that:

    1. Typically you display one page of rows, that would be 10, 20, 50, 100? But not 1000. Maybe you need just 1000, then disregard this point. You wrapped that into a VIEW. If you then select rows from that view with another (smaller) LIMIT, you get a sub-optimal query plan. Use the actual LIMIT (and OFFSET) you need, in a customized query. Remove the VIEW from the food chain, or remove the LIMIT from the VIEW. Postgres can typically find a different (faster) query plan for LIMIT 10. If you want paging, look into "keyset pagination":

    2. LIMIT without ORDER BY produces arbitrary results. Results can differ between two executions, even without changing anything else. Typically, you want deterministic results, defined by given criteria. Then optimize query (and indexes) for that. Recent related answer: