Search code examples
postgresqlperformancerestoredumpmaterialized-views

Search queries slow on Materialized Views of a restored dump as compared to the original DB


I have a psql DB containing various Materialized Views, on running a query, i.e., query_a we complete the query execution in 2800ms and re-running the same query again we get an execution time of 53ms. This can be explained by the caching done by psql. Now comes the tricky part, I create a dump of this db and restore it in NewDB, when I re-run query_a I get an execution time of 2253ms and on re-running get the same time, i.e., it seems that the psql caching is not working on the NewDB. I conducted various experiments to rectify the same and noticed that there is no improvement when I explicitly refresh the views but if I drop these views and re create it in my NewDB, it gives me the original performance. Note that the data is constant in DB and NewDB and I have used the commands mentioned here for dump creation and restore. The result for re running the query on DB is ->enter image description here

The results for running the same query on NewDB for 1st and 2nd time are as follows ->

                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=113790614477.61..113790614477.62 rows=1 width=8) (actual time=2284.605..2284.605 rows=1 loops=1)
   Buffers: shared hit=3540872
   CTE t
     ->  Merge Join  (cost=40600.92..11846650.56 rows=763041594 width=425) (actual time=3.693..1909.916 rows=6005 loops=1)
           Merge Cond: (n.node_id = nd.node_id)
           Buffers: shared hit=3524063
           ->  Index Scan using nodes_node_id on nodes n  (cost=0.43..350865.91 rows=3824099 width=389) (actual time=0.014..1651.025 rows=3598491 loops=1)
                 Buffers: shared hit=3523372
           ->  Sort  (cost=40600.49..40700.26 rows=39907 width=40) (actual time=3.668..4.227 rows=6005 loops=1)
                 Sort Key: nd.node_id
                 Sort Method: quicksort  Memory: 623kB
                 Buffers: shared hit=691
                 ->  Bitmap Heap Scan on nodes_depths nd  (cost=1153.11..37550.73 rows=39907 width=40) (actual time=0.627..2.846 rows=6005 loops=1)
                       Recheck Cond: ((ancestor_1 = 1) OR (ancestor_2 = 1))
                       Heap Blocks: exact=658
                       Buffers: shared hit=691
                       ->  BitmapOr  (cost=1153.11..1153.11 rows=40007 width=0) (actual time=0.547..0.547 rows=0 loops=1)
                             Buffers: shared hit=33
                             ->  Bitmap Index Scan on nodes_depths_1  (cost=0.00..566.58 rows=20003 width=0) (actual time=0.032..0.032 rows=156 loops=1)
                                   Index Cond: (ancestor_1 = 1)
                                   Buffers: shared hit=4
                             ->  Bitmap Index Scan on nodes_depths_2  (cost=0.00..566.58 rows=20003 width=0) (actual time=0.515..0.515 rows=5849 loops=1)
                                   Index Cond: (ancestor_2 = 1)
                                   Buffers: shared hit=29
   ->  Merge Right Join  (cost=169565733.26..97549168801.28 rows=6491839610305 width=0) (actual time=1915.721..2284.175 rows=6005 loops=1)
         Merge Cond: (nodes_fts.node_id = t.node_id)
         Buffers: shared hit=3540872
         ->  Index Only Scan using nodes_fts_idx on nodes_fts  (cost=0.43..97055.96 rows=1701569 width=4) (actual time=0.041..277.890 rows=1598712 loops=1)
               Heap Fetches: 1598712
               Buffers: shared hit=16805
         ->  Materialize  (cost=169565732.84..173380940.81 rows=763041594 width=4) (actual time=1915.675..1916.583 rows=6005 loops=1)
               Buffers: shared hit=3524067
               ->  Sort  (cost=169565732.84..171473336.82 rows=763041594 width=4) (actual time=1915.672..1916.057 rows=6005 loops=1)
                     Sort Key: t.node_id
                     Sort Method: quicksort  Memory: 474kB
                     Buffers: shared hit=3524067
                     ->  CTE Scan on t  (cost=0.00..15260831.88 rows=763041594 width=4) (actual time=3.698..1914.771 rows=6005 loops=1)
                           Buffers: shared hit=3524063
 Planning time: 68.064 ms
 Execution time: 2285.084 ms
(40 rows)

and for the second run ->

                                                                             QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=113790614477.61..113790614477.62 rows=1 width=8) (actual time=2295.319..2295.319 rows=1 loops=1)
   Buffers: shared hit=3540868
   CTE t
     ->  Merge Join  (cost=40600.92..11846650.56 rows=763041594 width=425) (actual time=15.324..1926.744 rows=6005 loops=1)
           Merge Cond: (n.node_id = nd.node_id)
           Buffers: shared hit=3524063
           ->  Index Scan using nodes_node_id on nodes n  (cost=0.43..350865.91 rows=3824099 width=389) (actual time=0.027..1648.277 rows=3598491 loops=1)
                 Buffers: shared hit=3523372
           ->  Sort  (cost=40600.49..40700.26 rows=39907 width=40) (actual time=15.254..15.903 rows=6005 loops=1)
                 Sort Key: nd.node_id
                 Sort Method: quicksort  Memory: 623kB
                 Buffers: shared hit=691
                 ->  Bitmap Heap Scan on nodes_depths nd  (cost=1153.11..37550.73 rows=39907 width=40) (actual time=3.076..10.752 rows=6005 loops=1)
                       Recheck Cond: ((ancestor_1 = 1) OR (ancestor_2 = 1))
                       Heap Blocks: exact=658
                       Buffers: shared hit=691
                       ->  BitmapOr  (cost=1153.11..1153.11 rows=40007 width=0) (actual time=2.524..2.525 rows=0 loops=1)
                             Buffers: shared hit=33
                             ->  Bitmap Index Scan on nodes_depths_1  (cost=0.00..566.58 rows=20003 width=0) (actual time=0.088..0.088 rows=156 loops=1)
                                   Index Cond: (ancestor_1 = 1)
                                   Buffers: shared hit=4
                             ->  Bitmap Index Scan on nodes_depths_2  (cost=0.00..566.58 rows=20003 width=0) (actual time=2.434..2.435 rows=5849 loops=1)
                                   Index Cond: (ancestor_2 = 1)
                                   Buffers: shared hit=29
   ->  Merge Right Join  (cost=169565733.26..97549168801.28 rows=6491839610305 width=0) (actual time=1933.113..2294.894 rows=6005 loops=1)
         Merge Cond: (nodes_fts.node_id = t.node_id)
         Buffers: shared hit=3540868
         ->  Index Only Scan using nodes_fts_idx on nodes_fts  (cost=0.43..97055.96 rows=1701569 width=4) (actual time=0.077..271.313 rows=1598712 loops=1)
               Heap Fetches: 1598712
               Buffers: shared hit=16805
         ->  Materialize  (cost=169565732.84..173380940.81 rows=763041594 width=4) (actual time=1933.030..1933.903 rows=6005 loops=1)
               Buffers: shared hit=3524063
               ->  Sort  (cost=169565732.84..171473336.82 rows=763041594 width=4) (actual time=1933.026..1933.375 rows=6005 loops=1)
                     Sort Key: t.node_id
                     Sort Method: quicksort  Memory: 474kB
                     Buffers: shared hit=3524063
                     ->  CTE Scan on t  (cost=0.00..15260831.88 rows=763041594 width=4) (actual time=15.336..1932.145 rows=6005 loops=1)
                           Buffers: shared hit=3524063
 Planning time: 1.154 ms
 Execution time: 2295.801 ms
(40 rows)

Solution

  • The estimated number of rows is off from the actual numbers by orders of magnitude:

    CTE Scan on t (cost=0.00..15260831.88 rows=763041594 width=4) (actual time=15.336..1932.145 rows=6005 loops=1)

    When Postgres can't make accurate estimates of how much work a particular way of executing your query is compared to another it will generate inefficient query plans and that is why the same query can be slow even if all the data is in RAM.

    When you backup a table the dump does not contain the statistics used by the optimizer so you need to wait for the autovacuum daemon or run 'ANALYZE ' manually after restoring from the dump.