How can I optimize the table or the query the following pgsql query(takes 34 min. to get 770 records)? Already indexes are added to the table for few columns. Not sure what else can make this query
Query:
SELECT
min(p.start_timestamp AT TIME ZONE p.timezone AT TIME ZONE 'America/Los_Angeles') as Date,
'America/Los_Angeles' AS Timezone,
sum(GREATEST(0, p.value)) as Value,
p.uom as UnitOfMeasurement
FROM
pv.bsa_vessel_vs p
WHERE
p.start_timestamp AT TIME ZONE p.timezone >= '2017-01-01'
and p.start_timestamp AT TIME ZONE p.timezone < '2017-02-01'
and p.vessel_serial_number ='U57625059'
GROUP BY
date_trunc('hour', p.start_timestamp AT TIME ZONE p.timezone AT TIME ZONE 'America/Los_Angeles'), p.uom
ORDER BY
Date ;
Table:
CREATE TABLE pv.bsa_vessel_vs
(
bsa_vessel_vs_id bigserial NOT NULL,
data_source_id bigint NOT NULL,
start_timestamp timestamp without time zone NOT NULL,
end_timestamp timestamp without time zone NOT NULL,
value numeric(12,4) NOT NULL,
uom text NOT NULL,
timezone text NOT NULL,
created_timestamp timestamp without time zone DEFAULT now(),
updated_timestamp timestamp without time zone DEFAULT now(),
vessel_serial_number text NOT NULL,
CONSTRAINT bsa_vessel_vs_pkey PRIMARY KEY (bsa_vessel_vs_id),
CONSTRAINT bsa_vessel_vs_data_source_id_fkey FOREIGN KEY (data_source_id)
REFERENCES pv.data_source (data_source_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE RESTRICT
)
WITH (
OIDS=FALSE
);
CREATE INDEX pm_start_timestamp_ndex
ON pv.bsa_vessel_vs
USING btree
(start_timestamp DESC NULLS LAST);
CREATE INDEX bsa_vessel_vs_meter_ts_idx
ON pv.bsa_vessel_vs
USING btree
(vessel_serial_number COLLATE pg_catalog."default", start_timestamp, end_timestamp);
CREATE UNIQUE INDEX bsa_vessel_vs_u_idx
ON pv.bsa_vessel_vs
USING btree
(data_source_id, vessel_serial_number COLLATE pg_catalog."default", start_timestamp, end_timestamp DESC);
Thanks Karthey
Change your index so that it includes the same expressions you use in your WHERE
clause, that is:
CREATE INDEX bsa_vessel_vs_meter_ts_2_idx
ON bsa_vessel_vs
USING btree
( vessel_serial_number COLLATE pg_catalog."default",
(start_timestamp AT TIME ZONE timezone),
(start_timestamp AT TIME ZONE timezone)
);
When you define that index, you'll get an execution plan that uses it:
| QUERY PLAN | | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Sort (cost=69.60..69.70 rows=39 width=83) | | Sort Key: (min(timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp)))) | | -> HashAggregate (cost=67.79..68.57 rows=39 width=83) | | Group Key: date_trunc('hour'::text, timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp))), uom | | -> Index Scan using bsa_vessel_vs_meter_ts_2_idx on bsa_vessel_vs p (cost=0.28..67.20 rows=39 width=44) | | Index Cond: ((vessel_serial_number = 'U57625059'::text) AND (timezone(timezone, start_timestamp) >= '2017-01-01 00:00:00+00'::timestamp with time zone) AND (timezone(timezone, start_timestamp) < '2017-02-01 00:00:00+00'::timestamp with time zone)) |
Whereas, if the index is not there, PostgreSQL resorts to a full table scan:
| QUERY PLAN | | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Sort (cost=298.84..298.94 rows=39 width=83) | | Sort Key: (min(timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp)))) | | -> GroupAggregate (cost=296.35..297.81 rows=39 width=83) | | Group Key: (date_trunc('hour'::text, timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp)))), uom | | -> Sort (cost=296.35..296.45 rows=39 width=44) | | Sort Key: (date_trunc('hour'::text, timezone('America/Los_Angeles'::text, timezone(timezone, start_timestamp)))), uom | | -> Seq Scan on bsa_vessel_vs p (cost=0.00..295.32 rows=39 width=44) | | Filter: ((vessel_serial_number = 'U57625059'::text) AND (timezone(timezone, start_timestamp) >= '2017-01-01 00:00:00+00'::timestamp with time zone) AND (timezone(timezone, start_timestamp) < '2017-02-01 00:00:00+00'::timestamp with time zone)) |
You can check all the setup at dbfiddle here