Search code examples
postgresqlquery-optimizationdatabase-partitioningpostgresql-12

Slow Query in partitioned table


We have postgresql12 and have a large table of size 108gb including index. Since, the query got slow, we tried partitioning the table. But this did not help.

    EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT
    val_a AS x,
    val_b AS y,
    SUM(value) AS value
FROM
    test_table_tp
WHERE
    client = '767jjDHIPLkshj'
    AND identity_1 = '12edfdijijasd'
    AND identity_2 = '98jjaskhuUUHss'
    AND identity_3 = 1
    AND date_col BETWEEN '2021-04-01'::date AND ('2021-07-01'::date + Interval '1 day')::date
GROUP BY
    val_a,
    val_b;

This takes around 3 minutes to execute.

->  HashAggregate  (cost=260438.41..260718.30 rows=27989 width=16) (actual time=214427.239..214427.614 rows=1298 loops=1)
               Output: test_table_tp.val_a, test_table_tp.val_b, sum(test_table_tp.value)
               Group Key: test_table_tp.val_a, test_table_tp.val_b
               Buffers: shared hit=216750 read=331936
               I/O Timings: read=211977.843
               ->  Append  (cost=0.56..258339.24 rows=279890 width=12) (actual time=3.057..213976.334 rows=722233 loops=1)
                     Buffers: shared hit=216750 read=331936
                     I/O Timings: read=211977.843
                     ->  Index Scan using idx_202104_108521 on public.test_table_tp_p2021_04_108521 test_table_tp  (cost=0.56..71154.06 rows=83535 width=12) (actual time=3.056..69033.315 rows=216908 loops=1)
                           Output: test_table_tp.val_a, test_table_tp.val_b, test_table_tp.value
                           Index Cond: (((test_table_tp.client)::text = '767jjDHIPLkshj'::text) AND ((test_table_tp.identity_1)::text = '12edfdijijasd'::text) AND ((test_table_tp.identity_2)::text = '98jjaskhuUUHss'::text) AND (test_table_tp.identity_3 = 1) AND (test_table_tp.date_col >= '2021-04-01'::date) AND (test_table_tp.date_col <= '2021-07-02'::date))
                           Buffers: shared hit=68990 read=96437
                           I/O Timings: read=68466.167
                     ->  Index Scan using idx_202105_108553 on public.test_table_tp_p2021_05_108553 test_table_tp_1  (cost=0.56..51361.84 rows=55441 width=12) (actual time=8.641..55685.999 rows=160618 loops=1)
                           Output: test_table_tp_1.val_a, test_table_tp_1.val_b, test_table_tp_1.value
                           Index Cond: (((test_table_tp_1.client)::text = '767jjDHIPLkshj'::text) AND ((test_table_tp_1.identity_1)::text = '12edfdijijasd'::text) AND ((test_table_tp_1.identity_2)::text = '98jjaskhuUUHss'::text) AND (test_table_tp_1.identity_3 = 1) AND (test_table_tp_1.date_col >= '2021-04-01'::date) AND (test_table_tp_1.date_col <= '2021-07-02'::date))
                           Buffers: shared hit=48314 read=69911
                           I/O Timings: read=55277.406
                     ->  Index Scan using idx_202106_108585 on public.test_table_tp_p2021_06_108585 test_table_tp_2  (cost=0.56..63581.82 rows=66779 width=12) (actual time=2.870..48249.339 rows=188842 loops=1)
                           Output: test_table_tp_2.val_a, test_table_tp_2.val_b, test_table_tp_2.value
                           Index Cond: (((test_table_tp_2.client)::text = '767jjDHIPLkshj'::text) AND ((test_table_tp_2.identity_1)::text = '12edfdijijasd'::text) AND ((test_table_tp_2.identity_2)::text = '98jjaskhuUUHss'::text) AND (test_table_tp_2.identity_3 = 1) AND (test_table_tp_2.date_col >= '2021-04-01'::date) AND (test_table_tp_2.date_col <= '2021-07-02'::date))
                           Buffers: shared hit=54983 read=90249
                           I/O Timings: read=47732.316
                     ->  Index Scan using idx_202107_108617 on public.test_table_tp_p2021_07_108617 test_table_tp_3  (cost=0.56..70842.08 rows=74135 width=12) (actual time=2.849..40902.561 rows=155865 loops=1)
                           Output: test_table_tp_3.val_a, test_table_tp_3.val_b, test_table_tp_3.value
                           Index Cond: (((test_table_tp_3.client)::text = '767jjDHIPLkshj'::text) AND ((test_table_tp_3.identity_1)::text = '12edfdijijasd'::text) AND ((test_table_tp_3.identity_2)::text = '98jjaskhuUUHss'::text) AND (test_table_tp_3.identity_3 = 1) AND (test_table_tp_3.date_col >= '2021-04-01'::date) AND (test_table_tp_3.date_col <= '2021-07-02'::date))
                           Buffers: shared hit=44463 read=75339
                           I/O Timings: read=40501.954
             Planning Time: 18.081 ms
             Execution Time: 214427.963 ms
 Planning Time: 0.083 ms
 Execution Time: 214461.427 ms
(41 rows)

Time: 214462.391 ms (03:34.462)

This is already an aggregated table which contains monthly aggregated data. Further aggregation is also not possible as there are many filters that can be applied. The identity columns have unique key in the table.

Can anything be done to further optimize this? Any help is greatly appreciated. Thanks you in advance.


Solution

  • This query will become slower with partitioning, although not much.

    Apart from getting a faster disk or more RAM, the only ways to speed up this query are

    1. include val_a, val_b and value into the index and VACUUM the table, so that you get a fast index-only scan

      CREATE INDEX ON test_table_tp (
         identity_1,
         identity_2,
         identity_3,
         date_col,
         val_a,
         val_b,
         value
      );
      
    2. CLUSTER the table using the index, so that the rows are stored in fewer blocks