sql postgresql stored-procedures query-optimization postgresql-performance

Query optimization: how can I speed up this query?


How can I optimize this query? I have created indexes and partitions and increased worker memory, but the execution time is still 35 s. How can I bring it down to 10-15 seconds?

Update:

  1. Removed the conversion of every timestamp from UTC to local time, i.e. time_stamp AT TIME ZONE 'utc' AT TIME ZONE, which improved performance by approximately 5 seconds. Current execution time: 36.5 seconds.

    EXPLAIN ANALYSE
    SELECT
        DATE_TRUNC('day', time_stamp) AS "time_stamp",
        COUNT(DISTINCT id) AS alarm_count,
        COUNT(DISTINCT patient_id) AS patient_count
    FROM alarm_management.alarm
    WHERE tenant_name = 'abc'
      AND unit = ANY('{a,b,c,d,e,f,g,h,i,j,k}'::text[])
      AND time_stamp BETWEEN '2021-09-15 02:25:00' AND '2021-12-14 04:36:45'
      AND severity_label = ANY('{a,b,c,d}'::text[])
      AND derived_label IS NOT NULL
    GROUP BY 1

EXPLAIN (ANALYZE, VERBOSE, BUFFERS) output:

GroupAggregate  (cost=3064683.77..3215681.44 rows=308821 width=24) (actual time=24242.730..35145.380 rows=91 loops=1)
  Group Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
  ->  Sort  (cost=3064683.77..3101468.12 rows=14713740 width=40) (actual time=24167.513..25036.293 rows=16369464 loops=1)
        Sort Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
        Sort Method: quicksort  Memory: 1672081kB
        ->  Append  (cost=0.00..1312964.42 rows=14713740 width=40) (actual time=0.308..20958.290 rows=16369464 loops=1)
              ->  Seq Scan on alarm_hospitalc_burn_2021_9  (cost=0.00..7175.10 rows=69691 width=40) (actual time=0.307..127.521 rows=94286 loops=1)
                    Filter: ((derived_label IS NOT NULL) AND (time_stamp >= '2021-09-15 02:25:00'::timestamp without time zone) AND (time_stamp <= '2021-12-14 04:36:45'::timestamp without time zone) AND (tenant_name = 'HospitalC'::text) AND (severity_label = ANY ('{"Short Yellow",Cyan,Red,Yellow}'::text[])) AND (unit = ANY ('{Burn,Delivery,EDI,EDT,EDW,ICU1,ICU2,ICU3P,ICU4P,PP,Tele}'::text[])))
              

Solution

  • The function can be written in plain SQL, which might be slightly faster:

    CREATE OR REPLACE FUNCTION dbo.get_time_group ( _date_type TEXT )
    RETURNS TEXT
    LANGUAGE sql -- plain SQL is good enough here
    IMMUTABLE -- same input always gives the same output, so the planner can pre-evaluate calls with constant arguments
    AS
    $$
        SELECT CASE $1
                WHEN 'hour' THEN 'hour'
                ELSE 'day'
            END;
    $$;
    

    But the most important thing is the query plan.
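
To make the point about the query plan concrete, here is a minimal sketch of how the plan could be regenerated with the function in place. It assumes dbo.get_time_group has been created as shown, and it reuses the table, columns, and placeholder filter values from the question. How the function is called in the real query is not shown in the question, so passing its result to date_trunc is an assumption.

```sql
-- Sketch only: table, columns, and filter values are copied from the question.
-- Assumption: dbo.get_time_group supplies the unit argument of date_trunc.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT
    date_trunc(dbo.get_time_group('day'), time_stamp) AS "time_stamp",
    COUNT(DISTINCT id)         AS alarm_count,
    COUNT(DISTINCT patient_id) AS patient_count
FROM alarm_management.alarm
WHERE tenant_name = 'abc'
  AND unit = ANY('{a,b,c,d,e,f,g,h,i,j,k}'::text[])
  AND time_stamp BETWEEN '2021-09-15 02:25:00' AND '2021-12-14 04:36:45'
  AND severity_label = ANY('{a,b,c,d}'::text[])
  AND derived_label IS NOT NULL
GROUP BY 1;
```

Because the function is IMMUTABLE and called with a constant, the Group Key in the output would be expected to show the already folded date_trunc('day'::text, ...) expression, as in the plan posted in the question. In that plan each node's time includes its children: the Append over the partitions accounts for about 21 s of the 35 s total, the Sort adds roughly 4 s (25036 ms minus 20958 ms), and the GroupAggregate with its two COUNT(DISTINCT ...) calls roughly 10 s (35145 ms minus 25036 ms). Reading the 16 million rows is therefore the largest single cost, and that is the part of the plan to look at first.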