Search code examples
sqlpostgresqlquery-optimization

Postgresql Query slow if empty table in IN clause


I have the following SQL

WITH filtered_users_pre as (
  SELECT value as username,row_number() OVER (partition by value) AS rk
    FROM "user-stats".tag_table
    WHERE _at_timestamp = 1626955200
       AND tag in ('commercial','marketing')
  ),

  filtered_users as (
    SELECT username
    FROM filtered_users_pre
    WHERE rk = 2
  ),

  valid_users as (
    SELECT aa.username, aa.rank, aa.points, aa.version
    FROM "users-results".ai_algo aa
    WHERE aa._at_timestamp = 1626955200
          AND aa.rank_timeframe = '7d'
          AND aa.username IN (SELECT * FROM filtered_users)
    ORDER BY aa.rank ASC
    LIMIT 15
    OFFSET 0
  )
select * from valid_users;

"user-stats".tag_table is a table with around 60 million rows, with proper indexes. "users-results".ai_algo is a table with around 10 million rows, with proper indexes.

With proper indexes I mean all the fields that appear in a WHERE clause above.

If filtered_users is empty, the query takes 4 seconds to run. If filtered_users has at least one row, it takes 400ms.

Anyone can explain me why? Is there I way I can have the query running with the same performance (400ms) also with filtered_users empty? I was expecting to get better performance with the reducing of number of rows in filtered_users. That's what happens up to 1 row. When the rows are 0, it takes 10 times more.

Of couse same happens if instead of IN clause in the WHERE, I put a INNER JOIN between ai_algo and filtered_users

Update This is the EXPLAIN (ANALYZE, BUFFERS) output query when filtered_users has 0 rows (4 secs of execution)

Limit  (cost=14592.13..15870.39 rows=15 width=35) (actual time=3953.945..3953.949 rows=0 loops=1)
  Buffers: shared hit=7456641
  ->  Nested Loop Semi Join  (cost=14592.13..1795382.62 rows=20897 width=35) (actual time=3953.944..3953.947 rows=0 loops=1)
        Join Filter: (aa.username = filtered_users_pre.username)
        Buffers: shared hit=7456641
        ->  Index Scan using ai_algo_202107_rank_timeframe_rank_idx on ai_algo_202107 aa  (cost=0.56..1718018.61 rows=321495 width=35) (actual time=0.085..3885.547 rows=313611 loops=1)
"              Index Cond: (rank_timeframe = '7d'::""valid-users-timeframe"")"
              Filter: (_at_timestamp = 1626955200)
              Rows Removed by Filter: 7793096
              Buffers: shared hit=7456533
        ->  Materialize  (cost=14591.56..14672.51 rows=13 width=21) (actual time=0.000..0.000 rows=0 loops=313611)
              Buffers: shared hit=108
              ->  Subquery Scan on filtered_users_pre  (cost=14591.56..14672.44 rows=13 width=21) (actual time=3.543..3.545 rows=0 loops=1)
                    Filter: (filtered_users_pre.rk = 2)
                    Rows Removed by Filter: 2415
                    Buffers: shared hit=108
                    ->  WindowAgg  (cost=14591.56..14638.74 rows=2696 width=29) (actual time=1.996..3.356 rows=2415 loops=1)
                          Buffers: shared hit=108
                          ->  Sort  (cost=14591.56..14598.30 rows=2696 width=21) (actual time=1.990..2.189 rows=2415 loops=1)
                                Sort Key: tag_table_20210722.value
                                Sort Method: quicksort  Memory: 285kB
                                Buffers: shared hit=108
                                ->  Bitmap Heap Scan on tag_table_20210722  (cost=146.24..14437.94 rows=2696 width=21) (actual time=0.612..1.080 rows=2415 loops=1)
"                                      Recheck Cond: ((tag)::text = ANY ('{commercial,marketing}'::text[]))"
                                      Filter: (_at_timestamp = 1626955200)
                                      Rows Removed by Filter: 2415
                                      Heap Blocks: exact=72
                                      Buffers: shared hit=105
                                      ->  Bitmap Index Scan on tag_table_20210722_tag_idx  (cost=0.00..145.57 rows=5428 width=0) (actual time=0.292..0.292 rows=4830 loops=1)
"                                            Index Cond: ((tag)::text = ANY ('{commercial,marketing}'::text[]))"
                                            Buffers: shared hit=33
Planning Time: 0.914 ms
Execution Time: 3954.035 ms

This is when filtered_users has at least 1 row (300ms)

Limit  (cost=14592.13..15870.39 rows=15 width=35) (actual time=15.958..300.759 rows=15 loops=1)
  Buffers: shared hit=11042
  ->  Nested Loop Semi Join  (cost=14592.13..1795382.62 rows=20897 width=35) (actual time=15.957..300.752 rows=15 loops=1)
        Join Filter: (aa.username = filtered_users_pre.username)
        Rows Removed by Join Filter: 1544611
        Buffers: shared hit=11042
        ->  Index Scan using ai_algo_202107_rank_timeframe_rank_idx on ai_algo_202107 aa (cost=0.56..1718018.61 rows=321495 width=35) (actual time=0.075..10.455 rows=645 loops=1)
"              Index Cond: (rank_timeframe = '7d'::""valid-users-timeframe"")"
              Filter: (_at_timestamp = 1626955200)
              Rows Removed by Filter: 16124
              Buffers: shared hit=10937
        ->  Materialize  (cost=14591.56..14672.51 rows=13 width=21) (actual time=0.003..0.174 rows=2395 loops=645)
              Buffers: shared hit=105
              ->  Subquery Scan on filtered_users_pre  (cost=14591.56..14672.44 rows=13 width=21) (actual time=1.895..3.680 rows=2415 loops=1)
                    Filter: (filtered_users_pre.rk = 1)
                    Buffers: shared hit=105
                    ->  WindowAgg  (cost=14591.56..14638.74 rows=2696 width=29) (actual time=1.894..3.334 rows=2415 loops=1)
                          Buffers: shared hit=105
                          ->  Sort  (cost=14591.56..14598.30 rows=2696 width=21) (actual time=1.889..2.102 rows=2415 loops=1)
                                Sort Key: tag_table_20210722.value
                                Sort Method: quicksort  Memory: 285kB
                                Buffers: shared hit=105
                                ->  Bitmap Heap Scan on tag_table_20210722  (cost=146.24..14437.94 rows=2696 width=21) (actual time=0.604..1.046 rows=2415 loops=1)
"                                      Recheck Cond: ((tag)::text = ANY ('{commercial,marketing}'::text[]))"
                                      Filter: (_at_timestamp = 1626955200)
                                      Rows Removed by Filter: 2415
                                      Heap Blocks: exact=72
                                      Buffers: shared hit=105
                                      ->  Bitmap Index Scan on tag_table_20210722_tag_idx  (cost=0.00..145.57 rows=5428 width=0) (actual time=0.287..0.287 rows=4830 loops=1)
"                                            Index Cond: ((tag)::text = ANY ('{commercial,marketing}'::text[]))"
                                            Buffers: shared hit=33
Planning Time: 0.310 ms
Execution Time: 300.954 ms

Solution

  • The problem is that if there are no matching filtered_users, PostgreSQL has to go through all "users-results".ai_algo without finding 15 result rows. If the subquery contains elements, it quickly finds 15 matching "users-results".ai_algo rows and can terminate processing.

    There is nothing you can do about that, but you can speed up the scan of "users-results".ai_algo. Currently, you have

    ->  Index Scan using ai_algo_202107_rank_timeframe_rank_idx on ai_algo_202107 aa
                                  ... (actual time=0.085..3885.547 rows=313611 loops=1)
          Index Cond: (rank_timeframe = '7d'::"valid-users-timeframe")
          Filter: (_at_timestamp = 1626955200)
          Rows Removed by Filter: 7793096
          Buffers: shared hit=7456533
    

    You see that the index scan is not as effective as it could be: it reads 313611 + 7793096 = 8106707 rows from the table and discards all but the 313611 that match the filter condition.

    You can do better by creating an index that can find only the result rows directly:

    CREATE INDEX ON "users-results".ai_algo (rank_timeframe, _at_timestamp);
    

    Then you can drop the index ai_algo_rank_timeframe_rank_idx, because the new index can do everything that the old one could do (and more).