Here my query is:
explain(buffers, analyze) SELECT DISTINCT e.eventid, e.objectid, e.clock, e.ns, e.name, e.severity
FROM EVENTS e, functions f, items i, hosts_groups hg
WHERE e.source='0' AND e.object='0' AND NOT EXISTS
(SELECT NULL FROM functions f, items i, hosts_groups hgg
LEFT JOIN rights r ON r.id=hgg.groupid AND r.groupid IN (12, 13, 14, ...)
WHERE e.objectid=f.triggerid AND f.itemid=i.itemid AND i.hostid=hgg.hostid
GROUP BY i.hostid HAVING MAX(permission)<2 OR MIN(permission) IS NULL OR MIN(permission)=0)
AND e.objectid=f.triggerid AND f.itemid=i.itemid AND i.hostid=hg.hostid
AND hg.groupid IN (1, 2, 3, ...)
AND e.value=1
ORDER BY e.eventid DESC;
You can find the related execution plan here.
As you can see, it spills to the disk. Because default value of work_mem
is 8 MB
. Than, I set work_mem
to 1 GB
on my session to see difference and run the query again. The new execution plans is here. Now, it is doing quicksort but still, the execution time is 779213.763 ms
.
This query is a auto - generated query by a third party tool but we can change it I assume.
Doing distinct - sort
for ~602k rows is insane. That is why I want to add more filter for clock column. Yet, I want to ask is there any other options to decrease execution time of this query?
Specifications for database server:
$ lscpu
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 2
NUMA node(s): 1
Memory: 96 GB
The database settings for:
max_parallel_workers_per_gather
---------------------------------
4
max_worker_processes
----------------------
16
max_parallel_workers
----------------------
16
Thanks!
It looks like the core of the problem is that the planner is not using a hashed subplan (where it runs it in bulk once and memorizes the results in a hash) for the NOT EXISTS, but rather is running it parameterized for each tuple in a loop. Usually this is because the planner thinks it will take too much memory to hash the results, but in this case I think it is just because it can not figure out how to analyze GROUP BY...HAVING.
You can guide it down the (presumably) correct path here by replacing the NOT EXISTS (...) with:
AND e.objectid NOT IN (
SELECT triggerid FROM functions f, items i, hosts_groups hgg
LEFT JOIN rights r ON r.id=hgg.groupid AND r.groupid IN (12, 13, 14 /*...*/)
WHERE f.itemid=i.itemid AND i.hostid=hgg.hostid
GROUP BY triggerid, i.hostid HAVING MAX(permission)<2 OR MIN(permission) IS NULL OR MIN(permission)=0
)
But before trying this, I might run just the inner query there by itself to see how long it takes and how many rows it returns.
If this ends up working, it might be worthwhile to investigate what it would take to make the planner smart enough to do this conversion on its own.