Search code examples
postgresqlperformancegreatest-n-per-grouppostgresql-performancedistinct-on

Why does distinct on in subqueries hurt performance in PostgreSQL?


I've got a table users with fields id and email. id is the primary key and email is indexed as well.

database> \d users
+-----------------------------+-----------------------------+-----------------------------------------------------+
| Column                      | Type                        | Modifiers                                           |
|-----------------------------+-----------------------------+-----------------------------------------------------|
| id                          | integer                     |  not null default nextval('users_id_seq'::regclass) |
| email                       | character varying           |                                                     |
+-----------------------------+-----------------------------+-----------------------------------------------------+
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "index_users_on_email" UNIQUE, btree (email)

If I query the table with a distinct on (email) clause in a subquery I get a significant performance penalty.

database> explain (analyze, buffers)
   select
     id
   from (
     select distinct on (email)
       id
     from
       users
   ) as t
   where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                  |
|-----------------------------------------------------------------------------------------------------------------------------|
| Subquery Scan on t  (cost=8898.69..10077.84 rows=337 width=4) (actual time=221.133..250.782 rows=1 loops=1)                 |
|   Filter: (t.id = 123)                                                                                                      |
|   Rows Removed by Filter: 67379                                                                                             |
|   Buffers: shared hit=2824, temp read=288 written=289                                                                       |
|   ->  Unique  (cost=8898.69..9235.59 rows=67380 width=24) (actual time=221.121..247.582 rows=67380 loops=1)                 |
|         Buffers: shared hit=2824, temp read=288 written=289                                                                 |
|         ->  Sort  (cost=8898.69..9067.14 rows=67380 width=24) (actual time=221.120..239.573 rows=67380 loops=1)             |
|               Sort Key: users.email                                                                                         |
|               Sort Method: external merge  Disk: 2304kB                                                                     |
|               Buffers: shared hit=2824, temp read=288 written=289                                                           |
|               ->  Seq Scan on users  (cost=0.00..3494.80 rows=67380 width=24) (actual time=0.009..9.714 rows=67380 loops=1) |
|                     Buffers: shared hit=2821                                                                                |
| Planning Time: 0.243 ms                                                                                                     |
| Execution Time: 251.258 ms                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------+

Compare this with distinct on (id) whose cost is less than one thousandth of the previous query.

database> explain (analyze, buffers)
   select
     id
   from (
     select distinct on (id)
       id
     from
       users
   ) as t
   where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                  |
|-----------------------------------------------------------------------------------------------------------------------------|
| Unique  (cost=0.29..8.31 rows=1 width=4) (actual time=0.021..0.022 rows=1 loops=1)                                          |
|   Buffers: shared hit=3                                                                                                     |
|   ->  Index Only Scan using users_pkey on users  (cost=0.29..8.31 rows=1 width=4) (actual time=0.020..0.020 rows=1 loops=1) |
|         Index Cond: (id = 123)                                                                                              |
|         Heap Fetches: 1                                                                                                     |
|         Buffers: shared hit=3                                                                                               |
| Planning Time: 0.090 ms                                                                                                     |
| Execution Time: 0.034 ms                                                                                                    |
+-----------------------------------------------------------------------------------------------------------------------------+

Why is this?

The real problem I'm having is that I'm trying to create a view that does distinct on an indexed column that isn't unique and the performance is very bad.


Solution

  • Logical difference

    Both columns id and email are UNIQUE. But only id is NOT NULL. (PRIMARY KEY columns always are.) NULL values are not considered equal, multiple NULL values are allowed in a column with UNIQUE constraint (or index). That's according to standard SQL. See:

    But DISTINCT or DISTINCT ON consider NULL values equal. The manual:

    Obviously, two rows are considered distinct if they differ in at least one column value. Null values are considered equal in this comparison.

    Bold emphasis mine. Further reading:

    In your second query, distinct on (id) is a logical no-op: the result is guaranteed to be the same as without DISTINCT ON. And since the outer SELECT filters on id = 123, Postgres can strip the noise and work off a very cheap index-only scan.

    In your first query, on the other hand, distinct on (email) might actually do something if there are more than one rows with email IS NULL. Then Postgres has to pick the first id according to the given sort order. Since there is no ORDER BY, it results in an arbitrary pick. But the outer SELECT with the predicate where id = 123 may depend on the result. The whole query is different from the first on principle - and broken by design.

    Serendipity

    All that aside, two "lucky" findings stick out:

    Sort Method: external merge  Disk: 2304kB
    

    Mention of "Disk" indicates insufficient work_mem. See:

              ->  Seq Scan on users  (cost=0.00..3494.80 rows=67380 
    

    In my tests, I always got an index scan here. Indicates a bloated index or some other problem with your setup.

    Useful comparison?

    The comparison is going nowhere. We might learn something from comparing the first query with this query - after switching roles of PK and UNIQUE column:

    select email
    from  (select distinct on (id) email from users) t
    where email = '[email protected]';
    

    Or from comparing the second query with this one - trying the same with the UNIQUE column instead of the PK column:

    select email
    from  (select distinct on (email) email from users) t
    where email = '[email protected]';
    

    We learn that PK and UNIQUE constraint show no different effect on query plans. Postgres does not use the meta-information to cut corners. The PK would actually make a difference with GROUP BY. See:

    So this works:

    SELECT email
    FROM  (
       SELECT email -- no aggregate required, because id = PK
       FROM   users
       GROUP  BY id  -- !
       ) t
    WHERE email = '[email protected]';
    

    But the same does not work after switching id and email. I added some demos to the fiddle:

    db<>fiddle here

    So?

    Both queries are nonsense for different reasons. I don't see how they can help with your real query:

    The real problem I'm having is that I'm trying to create a view that does distinct on an indexed column that isn't unique and the performance is very bad.

    We'd need to see your real query - and all other relevant details of your setup. There may be solutions, but that may well beyond the scope of a question on SO. Consider hiring a consultant. Or consider one of these methods to optimize performance: