Search code examples
sqlpostgresqlquery-performancesql-viewlateral-join

How to utilize an underlying INDEX when filtering a VIEW?


I have a very simple query that executed surprisingly slow and the reasons turned out that it was doing a scan on the view after doing a JOIN. I was a bit surprised to see that, as I was expecting Postgres to do the filtering before joining, seeing that one of the underlying tables had an INDEX on the column that was to be filtered.

Is there a way I can reorder the query somehow or hint to the planner how to do the query differently?

Mind you, I do know how to hack around the issue by directly accessing the underlying tables, but the view hides a bit of complexity it would be nice to keep out of the queries.

The query

select * from form where encounter_id= 23728 and type = 'vitals';

The EXPLAIN ANALYZE

Subquery Scan on form  (cost=0.57..3439.07 rows=1 width=622) (actual time=8.187..8.187 rows=0 loops=1)
  Filter: ((form.encounter_id = 23728) AND (form.type = 'vitals'::text))
  Rows Removed by Filter: 12000
  ->  Unique  (cost=0.57..3259.07 rows=12000 width=626) (actual time=0.008..7.612 rows=12000 loops=1)
        ->  Merge Join  (cost=0.57..3229.07 rows=12000 width=626) (actual time=0.007..5.485 rows=12000 loops=1)
              Merge Cond: (fd.form_id = f.id)
              ->  Index Scan using _idx_form_details on _form_details fd  (cost=0.29..2636.78 rows=12000 width=603) (actual time=0.003..1.918 rows=12000 loops=1)
              ->  Index Scan using pk_form on _form f  (cost=0.29..412.29 rows=12000 width=27) (actual time=0.002..1.214 rows=12000 loops=1)
Planning time: 0.170 ms
Execution time: 8.212 ms

The TABLE and VIEW definitions

CREATE TABLE _form (
  id INT NOT NULL,
  encounter_id INT REFERENCES _encounter (id)            NOT NULL,
  type         TEXT                                      NOT NULL,
  CONSTRAINT pk_form PRIMARY KEY (id),
  FOREIGN KEY (cid) REFERENCES _user_in_role (id)
);

CREATE INDEX encounter_id ON _form (encounter_id, type);

CREATE TABLE _form_details (
  id INT NOT NULL,
  form_id   INT REFERENCES _form (id) NOT NULL,
  archived  BOOLEAN                   NOT NULL DEFAULT FALSE,
  CONSTRAINT pk_form_details PRIMARY KEY (id),
  FOREIGN KEY (cid) REFERENCES _user_in_role (id)
);

CREATE VIEW form AS
  SELECT DISTINCT ON (f.id)
    f.id,
    f.encounter_id,
    f.type,
    fd.archived,
    f.cid
  FROM _form f 
       JOIN _form_details fd 
         ON (f.id = fd.form_id)
  ORDER BY f.id, fd.id DESC;

EDIT: Someone posted an answer (subsequently deleted) which contained an important tidbit of information: even though the encounter_id column in the underlying table is indexed, the ORDER BY operation in the VIEW defeats its purpose. Unfortunately, we cannot rid ourselves of ORDER BY as it is required for the DISTINCT ON to work.


Solution

  • @a_horse_with_no_name gave me the fastest solution so far, in the chat, but never supplied an answer. So for reference, here's his solution, using a lateral join to create the view.

    CREATE VIEW form AS
        SELECT f.id, 
            f.encounter_id, 
            f.type, 
            fd.archived, 
            f.cid 
        FROM _form f 
        JOIN LATERAL ( 
            SELECT form_id, archived 
            FROM _form_details _fd 
            WHERE _fd.form_id = f.id 
            ORDER BY _fd.id DESC 
            LIMIT 1 
        ) AS fd ON TRUE; 
    

    This is about 10x faster than any other solutions. If created as form4, based on the same tables as the ones from @wildplasser, here's how it performs:

    select * from form where encounter_id= 23728 and ztype = 'vitals' ;
    Time: 181.065 ms
    select * from form2 where encounter_id= 23728 and ztype = 'vitals' ;
    Time: 12.395 ms
    select * from form3 where encounter_id= 23728 and ztype = 'vitals' ;
    Time: 122.305 ms
    select * from form4 where encounter_id= 23728 and ztype = 'vitals' ;
    Time: 1.305 ms
    

    Some good pointers on lateral joins, introduced in Postgres 9.3: