Search code examples
postgresqldatabase-performancequery-performance

Postgres select query runs slow when using JDBC but quick when running it in PSQL from the same server


I have the following, pretty simple, query that I am running using JDBC that is taking an excessive amount of time to run. The database is an AWS RDS server. The rt2 table has about 600K entries and the CM2 table about 300K. The query returns 11230 rows.

SELECT cm2.target 
from sysmgmt.sys_root rt2 
   join cmgmt.member cm2 on cm2.cmid = rt2.cmid and cm2.version=rt2.work_version_id 
where rt2.tid=1001 
  and rt2.proj='d791194b-f2b7-42a7-aba7-f879e052e59d'::uuid 
  and rt2.deleted = false 
  and cm2.tid=1001 and cm2.proj = 'd791194b-f2b7-42a7-aba7-f879e052e59d'::uuid;

When I run this query with the JDBC call it takes 40 seconds! However, if I run this exact same query in a PSQL command line on the same machine it is almost instantaneous.

Running EXPLAIN ANALYZE shows the following plan.

Nested Loop  (cost=0.85..7.77 rows=1 width=176) (actual time=0.030..36.067 rows=11230 loops=1)
->  Index Scan using m_cell_tid_proj_version_idx on member cm2  (cost=0.42..3.32 rows=1 width=197) (actual time=0.020..2.988 rows=11230 loops=1)
    Index Cond: ((tid = '1001'::numeric) AND (proj = 'ed1a7c79-a3a1-4d8e-815b-0fbbcbd7bf4b'::uuid))
->  Index Scan using sys_root_cmid_workversion_idx on sys_root rt2  (cost=0.42..4.45 rows=1 width=21) (actual time=0.002..0.002 rows=1 loops=11230)
    Index Cond: ((cmid = cm2.cmid) AND (work_version_id = cm2.version))
    Filter: ((NOT deleted) AND (tid = '1001'::numeric) AND (proj = 'ed1a7c79-a3a1-4d8e-815b-0fbbcbd7bf4b'::uuid))
Planning Time: 0.374 ms
Execution Time: 36.499 ms

Some things I have tried to speed it up.

  • rearrange the query
  • Adding indexes that match better
  • Change fillfactor (didn't seem to have any effect)
  • VACUUM

None of these seem to have any effect. The java code is pretty straight forward, run the query and then iterate over the results. The timing is taken just before executeQuery and just after it.

  Took :[40644.067138] Comment:found 11230 SQL Query:[SELECT cm2.target from sysmgmt.sys_root rt2 join mgmt.member cm2 on cm2.cmid = rt2.cmid and cm2.version=rt2.work_version_id where rt2.tid=1001 and rt2.proj='ed1a7c79-a3a1-4d8e-815b-0fbbcbd7bf4b'::uuid and cm2.tid=1001 and cm2.proj = 'ed1a7c79-a3a1-4d8e-815b-0fbbcbd7bf4b'::uuid and rt2.deleted = false]

There are about 5 to 10 other queries that are running in the same transaction, is it possible that the other queries cause an issue downstream to when it gets to this query?

If anyone has any ideas on what might be the problem, I would appreciate some insight.


Solution

  • It turns out that the Explain Analyze plan was different when it was being run in in the context of the activity. Turning up the auto_explain parameters and having it log to the postgres log files showed that the plan was different than when I ran it as a standalone request. The question then became "how to get it to do the right thing?". The answer to that was to turn up the default_statistics_target to 200 from 100 and run ANALYZE on the database. I also rearranged the join table order. In doing both of these, the problem has gone away (for good hopefully). This wiki https://wiki.postgresql.org/wiki/Performance_Optimization also proved to be a great resource.