I have a query that returns 0 rows, but on one specific machine it takes around 10 seconds to complete. On every other machine I tested with the same database (although different PostgreSQL versions), the query is almost instant.
Also, the results I get from "explain (analyze)" are roughly the same on the "fast" machines, but different on the problematic one: it never does a parallel seq scan, while all the other machines do, even though the parallel worker configurations are identical on every machine. Results (all against the same database; PC#1 is the problematic one):
PC#1, PostgreSQL 11.5
Seq Scan on cc_parc (cost=0.00..1175721.63 rows=58 width=853) (actual time=9537.152..9537.152 rows=0 loops=1)
Filter: ((class = 'db1.C_Parcela'::text) AND (fact = 221327))
Rows Removed by Filter: 217422
Buffers: shared hit=8034 read=1163008
Planning Time: 0.448 ms
Execution Time: 9537.602 ms
PC#2, PostgreSQL 12.1
Gather (cost=1000.00..11684.49 rows=6 width=377) (actual time=136.784..139.779 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=344 read=8981
-> Parallel Seq Scan on cc_parc (cost=0.00..10683.89 rows=2 width=377) (actual time=30.167..30.167 rows=0 loops=3)
Filter: ((class = 'db1.C_Parcela'::text) AND (fact = 221327))
Rows Removed by Filter: 72474
Buffers: shared hit=344 read=8981
Planning Time: 1.945 ms
Execution Time: 139.860 ms
PC#3, PostgreSQL 13.2
Gather (cost=1000.00..11684.39 rows=5 width=1495) (actual time=70.610..88.546 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=212 read=9113
-> Parallel Seq Scan on cc_parc (cost=0.00..10683.89 rows=2 width=1495) (actual time=43.353..43.353 rows=0 loops=3)
Filter: ((class = 'db1.C_Parcela'::text) AND (fact = 221327))
Rows Removed by Filter: 72474
Buffers: shared hit=212 read=9113
Planning:
Buffers: shared hit=311 read=6
Planning Time: 1.047 ms
Execution Time: 88.604 ms
PC#4, PostgreSQL 10.5
Gather (cost=1000.00..11685.63 rows=6 width=377) (actual time=98.253..121.924 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=2 read=9324
-> Parallel Seq Scan on cc_parc (cost=0.00..10685.03 rows=2 width=377) (actual time=53.534..53.535 rows=0 loops=3)
Filter: ((class = 'db1.C_Parcela'::text) AND (fact = 221327))
Rows Removed by Filter: 72474
Buffers: shared hit=2 read=9324
Planning time: 7.315 ms
Execution time: 122.090 ms
The number of filtered rows also differs between machines. The specific query ( select * from cc_parc where class='db1.C_Parcela' and fact=221327;
) returns 0 rows, and the table's total row count is 217422, yet the "Rows Removed by Filter" figure is not the same everywhere (note that in a parallel plan this figure is the per-loop average across the 3 workers, and 72474 × 3 = 217422):
PC#1 - Rows Removed by Filter: 217422
All other PCs - Rows Removed by Filter: 72474
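To rule out configuration drift between the machines, the settings that influence parallel plan selection can be compared directly; a quick check that can be run on each machine and diffed:

```sql
-- Settings that control whether the planner chooses a parallel seq scan;
-- run on each machine and compare the output.
SELECT name, setting
FROM pg_settings
WHERE name IN ('max_parallel_workers_per_gather',
               'max_parallel_workers',
               'min_parallel_table_scan_size',
               'parallel_setup_cost',
               'parallel_tuple_cost');
```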
I have also tried running all of the same tests on the same machine but with PostgreSQL 12, and the situation is resolved; the values are the expected ones again:
PC#1, PostgreSQL 12.2
Gather (cost=1000.00..11684.59 rows=7 width=377) (actual time=94.210..99.924 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=370 read=8955
-> Parallel Seq Scan on cc_parc (cost=0.00..10683.89 rows=3 width=377) (actual time=51.816..51.816 rows=0 loops=3)
Filter: ((class = 'db1.C_Parcela'::text) AND (fact = 221327))
Rows Removed by Filter: 72474
Buffers: shared hit=370 read=8955
Planning Time: 4.815 ms
Execution Time: 100.059 ms
Am I safe to assume that some sort of bug in that PostgreSQL 11 version caused this, or does anyone have another idea?
I noticed that after doing a dump and restore on the same PostgreSQL version, the issue resolved itself.
After analysing the output of select * from pgstattuple('public.cc_parc');
and comparing the result in a "bugged" database vs. a correctly functioning one, I noticed that table_len in the bugged DB was 125x bigger... which is certainly not supposed to be the case.
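For reference, this is the kind of comparison that exposes the bloat (pgstattuple ships as a contrib extension; the 125x factor above is from my databases, your numbers will differ):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- table_len is the physical size of the heap; a table_len far larger
-- than tuple_len + dead_tuple_len + free_space indicates bloat.
SELECT table_len,
       tuple_count,
       tuple_len,
       dead_tuple_count,
       dead_tuple_len,
       free_space,
       free_percent
FROM pgstattuple('public.cc_parc');
```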
After running a VACUUM FULL
on the table, table_len was restored to its correct value and the slowness issue was completely solved.
Running VACUUM FULL routinely is not advisable (it takes an ACCESS EXCLUSIVE lock and rewrites the whole table), so I will try to find what bloated the table like this. Most of the suggestions in here seem appropriate.
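One way to watch for this recurring is to track dead tuples and the last (auto)vacuum times per table; a sketch using the standard statistics view:

```sql
-- Tables with the most dead tuples, plus when they were last vacuumed;
-- a large n_dead_tup with an old last_autovacuum suggests autovacuum
-- is not keeping up and bloat is accumulating.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```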