
Slow running PostgreSQL query


I am currently migrating a system to Postgres and I cannot understand why one specific query runs so incredibly slowly. In both Transact-SQL and Oracle, the same query finishes in under 200 ms. First things first, though: I have a big table with 14,000,000 rows at the moment, and it only keeps growing. The table also has 54 columns, so each row carries quite a lot of data.

The table has a rather straight forward structure like this:

CREATE TABLE logtable (
    key varchar(20) NOT NULL,
    column1 int4 NULL,
    entrytype int4 NULL,
    column2 int4 NULL,
    column3 int4 NULL,
    column4 int4 NULL,
    column5 int4 NULL,
    column6 int4 NULL,
    column7 varchar(128) NULL,
    column8 varchar(2048) NULL,
    column9 varchar(2048) NULL,
    column10 varchar(2048) NULL,
    ...
    timestampcol timestamp NULL,
    column48 timestamp NULL,
    column49 timestamp NULL,
    column50 timestamp NULL,
    column51 timestamp NULL,
    column52 int4 NULL,
    column53 int4 NULL,
    column54 varchar(20) NULL,
    CONSTRAINT key PRIMARY KEY (key)
);

We also have a few predefined indexes:

CREATE INDEX idx1 ON logtable USING btree (id);
CREATE INDEX idx2 ON logtable  USING btree (archiveinterval);
CREATE INDEX idx3 ON logtable  USING btree (archivestatus);
CREATE INDEX idx4 ON logtable  USING btree (entrytype);
CREATE INDEX idx5 ON logtable  USING btree (column34);
CREATE INDEX idx6 ON logtable  USING btree (timestampcol);

Now the actual query that I perform is the following:

SELECT column1,..,column54
FROM logtable
where ((entrytype = 4000 or entrytype = 4001 or entrytype = 4002) and (archivestatus <= 1))
order by timestampcol desc;

This returns roughly 500K rows. When establishing the connection, I also pass defaultRowFetchSize=5000 so the driver does not try to fetch the full result set at once. As mentioned before, the same query takes about 200 ms in Oracle and MSSQL, which leaves me wondering what exactly is going on here. When I add a LIMIT 100, the query time drops to about 100 ms. I have already raised these settings, since they come up in multiple forum threads:

maintenance_work_mem    1GB
shared_buffers  2GB
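For completeness, this is roughly how those settings were applied (a sketch; note that shared_buffers only takes effect after a server restart, while maintenance_work_mem can be picked up with a config reload):

```sql
-- Raise the settings mentioned above (requires a sufficiently privileged role).
ALTER SYSTEM SET maintenance_work_mem = '1GB';
ALTER SYSTEM SET shared_buffers = '2GB';

-- Reload the configuration; shared_buffers additionally needs a restart.
SELECT pg_reload_conf();

-- Verify the values the server is actually using:
SHOW maintenance_work_mem;
SHOW shared_buffers;
```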

I've also tried to make sense of the EXPLAIN ANALYZE output for the query. As I read it, nearly all of the time is spent in the parallel bitmap heap scan.

Gather Merge  (cost=459158.89..507278.61 rows=412426 width=2532) (actual time=57323.536..59044.943 rows=514825 loops=1)
  Output: key, column2 ... column54
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=1411 read=292867
  ->  Sort  (cost=458158.86..458674.40 rows=206213 width=2532) (actual time=57243.386..57458.979 rows=171608 loops=3)
        Output: key, column2 ... column54
        Sort Key: logtable.timestampcol DESC
        Sort Method: quicksort  Memory: 60266kB
        Worker 0:  Sort Method: quicksort  Memory: 57572kB
        Worker 1:  Sort Method: quicksort  Memory: 57878kB
        Buffers: shared hit=1411 read=292867
        Worker 0: actual time=57218.621..57449.331 rows=168159 loops=1
          Buffers: shared hit=470 read=94622
        Worker 1: actual time=57192.076..57423.333 rows=169151 loops=1
          Buffers: shared hit=461 read=95862
        ->  Parallel Bitmap Heap Scan on logtable (cost=9332.66..439956.67 rows=206213 width=2532) (actual time=1465.971..56452.327 rows=171608 loops=3)
              Output: key, column2 ... column54
              Recheck Cond: ((logtable.entrytype = 4000) OR (logtable.entrytype = 4001) OR (logtable.entrytype = 4002))
              Filter: ((logtable.entrytype = 4000) OR (logtable.entrytype = 4001) OR ((logtable.entrytype = 4002) AND (logtable.archivestatus <= 1)))
              Heap Blocks: exact=101535
              Buffers: shared hit=1397 read=292867
              Worker 0: actual time=1440.278..56413.158 rows=168159 loops=1
                Buffers: shared hit=463 read=94622
              Worker 1: actual time=1416.245..56412.907 rows=169151 loops=1
                Buffers: shared hit=454 read=95862
              ->  BitmapOr  (cost=9332.66..9332.66 rows=500289 width=0) (actual time=1358.696..1358.697 rows=0 loops=1)
                    Buffers: shared hit=6 read=1322
                    ->  Bitmap Index Scan on idx4(entrytype)  (cost=0.00..1183.80 rows=66049 width=0) (actual time=219.270..219.271 rows=65970 loops=1)
                          Index Cond: (logtable.entrytype = 4000)
                          Buffers: shared hit=1 read=171
                    ->  Bitmap Index Scan on idx4(entrytype)  (cost=0.00..3792.43 rows=211733 width=0) (actual time=691.854..691.855 rows=224437 loops=1)
                          Index Cond: (logtable.entrytype = 4001)
                          Buffers: shared hit=2 read=576
                    ->  Bitmap Index Scan on idx4(entrytype)  (cost=0.00..3985.24 rows=222507 width=0) (actual time=447.558..447.558 rows=224418 loops=1)
                          Index Cond: (logtable.entrytype = 4002)
                          Buffers: shared hit=3 read=575
Planning Time: 0.562 ms
Execution Time: 59503.154 ms

When I run the same query WITHOUT the ORDER BY, it finishes in about 1.6 s, which seems reasonable enough. When I remove the WHERE clause instead, it finishes in 86 ms, thanks to my idx6.

I am kind of out of ideas. I have tried multiple indexes, including composite indexes such as (entrytype, archivestatus, timestampcol) in different column orders and with DESC. Is there anything else I could try?
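The composite-index attempts looked roughly like this (a sketch; idx7 and idx8 are placeholder names, and the partial-index variant is one more thing in the same vein rather than something I claim to have benchmarked):

```sql
-- Composite index covering the filter columns plus the sort column:
CREATE INDEX idx7 ON logtable (entrytype, archivestatus, timestampcol DESC);

-- A partial index restricted to exactly the rows the query selects,
-- sorted the way the query sorts them:
CREATE INDEX idx8 ON logtable (timestampcol DESC)
    WHERE entrytype IN (4000, 4001, 4002) AND archivestatus <= 1;
```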

UPDATE:

Since a few of you asked, here is the query execution plan for Oracle. As I said, the literally identical statement with the same indexes runs in 0.2 seconds in Oracle, whereas it needs about 30-50 s in Postgres.

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                         |  6878 |  2491K|  2147   (1)| 00:00:01 |
|   1 |  SORT ORDER BY                         |                         |  6878 |  2491K|  2147   (1)| 00:00:01 |
|   2 |   CONCATENATION                        |                         |       |       |            |          |
|*  3 |    TABLE ACCESS BY INDEX ROWID BATCHED | logtable                |   712 |   257K|   168   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                   | entrytype               |   712 |       |     5   (0)| 00:00:01 |
|   5 |    INLIST ITERATOR                     |                         |       |       |            |          |
|*  6 |     TABLE ACCESS BY INDEX ROWID BATCHED| logtable                |  6166 |  2233K|  1433   (1)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | idx_entrytype           |  6166 |       |    22   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------

As someone mentioned, I had already tried set enable_bitmapscan to off. It did make the query faster, but not to the point where I would consider using it.
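For reference, bitmap scans were disabled for the session only, like this (a sketch; the column list is abbreviated):

```sql
-- Disable bitmap scans for the current session only, then re-run the query:
SET enable_bitmapscan = off;

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT column1, column2  -- ... all 54 columns
FROM logtable
WHERE (entrytype = 4000 OR entrytype = 4001 OR entrytype = 4002)
  AND archivestatus <= 1
ORDER BY timestampcol DESC;

-- Restore the planner default afterwards:
RESET enable_bitmapscan;
```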

Gather Merge  (cost=543407.97..593902.72 rows=432782 width=2538) (actual time=26207.686..27543.386 rows=515559 loops=1)
  Output: column1 ... column54
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=258390 read=147694 dirtied=3 written=1
  ->  Sort  (cost=542407.94..542948.92 rows=216391 width=2538) (actual time=26135.793..26300.677 rows=171853 loops=3)
        Output: column1 ... column54
        Sort Key: logtable.timestampcol DESC
        Sort Method: quicksort  Memory: 61166kB
        Worker 0:  Sort Method: quicksort  Memory: 56976kB
        Worker 1:  Sort Method: quicksort  Memory: 57770kB
        Buffers: shared hit=258390 read=147694 dirtied=3 written=1
        Worker 0: actual time=26100.640..26257.665 rows=166629 loops=1
          Buffers: shared hit=83315 read=48585 dirtied=2
        Worker 1: actual time=26102.323..26290.745 rows=169509 loops=1
          Buffers: shared hit=84831 read=48779
        ->  Parallel Seq Scan on logtable  (cost=0.00..523232.15 rows=216391 width=2538) (actual time=3.752..25627.657 rows=171853 loops=3)
              Output: column1 ... column54
              Filter: ((logtable.entrytype = 4000) OR (logtable.entrytype = 4001) OR ((logtable.entrytype = 4002) AND (logtable.archivestatus <= 1)))
              Rows Removed by Filter: 4521112
              Buffers: shared hit=258294 read=147694 dirtied=3 written=1
              Worker 0: actual time=1.968..25599.701 rows=166629 loops=1
                Buffers: shared hit=83267 read=48585 dirtied=2
              Worker 1: actual time=3.103..25604.552 rows=169509 loops=1
                Buffers: shared hit=84783 read=48779
Planning Time: 0.816 ms
Execution Time: 27914.204 ms

Just to clarify: my hope is that there is some configuration setting, index, or something else I have missed. Since a generic mechanism creates this query, it would be quite ugly to implement a database-specific query JUST for this one table. It is by far our biggest table, containing log entries from throughout the system. (I don't exactly like this design, but it is what it is.) There must be a reason why Postgres in particular is so much slower than the other databases on this amount of data.

As multiple users pointed out, the condition should be:

where ((entrytype = 4000 or entrytype = 4001 or entrytype = 4002) and (archivestatus <= 1))

NOT

where (entrytype = 4000 or entrytype = 4001 or entrytype = 4002 and (archivestatus <= 1))
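The intended condition can also be written with an IN list, which is equivalent, avoids the precedence pitfall entirely, and is what the planner normalizes the OR chain into anyway (a sketch):

```sql
WHERE entrytype IN (4000, 4001, 4002)
  AND archivestatus <= 1
```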

Sorry for the confusion.


Solution

  • After testing different versions of Postgres, 12.6 and 13.2 on Windows and even 13.2 in a Linux Docker container, I could never reproduce the issue. We concluded that it must have been the environment acting up, or else disk or network issues. Either way, this was not a Postgres error.