I have the following query which performs badly:
select
distinct
u.uuid
u.user_name,
u.key
from request req
join int_user u on u.uuid = req.user_uuid
join int_right r on r.uuid = req.right_uuid
where r.uuid in (
select r2.uuid from int_right r2
where
(
lower(r2.right_name) like '%keyword%'
or lower(r2.right_key) like '%keyword%'
)
)
The sub-query is uncorrelated and it will usually return few rows, sometimes only one row. Now I don't understand why if I take the sub-query and execute it separately then take the result list and add it statically with IN operator to outer query then it will perform very well, from 3-6s execution time down to 0.05s.
r.uuid in ('value1', 'value2', 'value3')
How could I tell oracle to execute my sub-query first then apply the result set to outer query?
Few notes:
From execution plan it seems that oracle does a full scan in all tables. The cost and cardinality is very big on request table. It's also interesting that even if my sub-query would return a single row for a certain search criteria the query is still slow, however if I would replace IN operator with equals(=) then the query becomes very fast and the cost low. It also appears that in this case oracle would only do a full scan in int_right table and for other table it would do a unique or range scan.
I also tried other variants of this query, like adding the conditions directly to outer query, use exists or correlated sub-query but it's still slow in any case.
You generally cannot tune a SQL statement by looking at the text (unless there is a fundamental flaw in the code, eg missing a join condition etc). For Oracle, one of the most productive ways to proceed is:
1) Execute the problematic statement with an additional hint as below
select /*+ gather_plan_statistics */ ... <rest of query>
2) Run the following to get execution plan metrics
select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'))
That way you will:
a) See the true execution plan that was used,
b) Get estimate/actual row source counts for each step in the plan. If the estimate vs actuals are way out, then that typically is where to focus your attention because this is where the optimizer most probably did not have sufficient or accurate enough information to work with.
For example
SQL> select /*+ gather_plan_statistics */ count(dname)
2 from scott.emp e, scott.dept d
3 where e.sal <= 1500
4 and d.deptno = e.deptno;
COUNT(DNAME)
------------
7
1 row selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
SQL_ID c1cb4s8b141h8, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(dname) from scott.emp e,
scott.dept d where e.sal <= 1500 and d.deptno = e.deptno
Plan hash value: 3037575695
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 9 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 9 |
| 2 | MERGE JOIN | | 1 | 3 | 7 |00:00:00.01 | 9 |
| 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 4 | 4 |00:00:00.01 | 2 |
| 4 | INDEX FULL SCAN | DEPT_PK | 1 | 4 | 4 |00:00:00.01 | 1 |
|* 5 | SORT JOIN | | 4 | 3 | 7 |00:00:00.01 | 7 |
|* 6 | TABLE ACCESS FULL | EMP | 1 | 3 | 7 |00:00:00.01 | 7 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("D"."DEPTNO"="E"."DEPTNO")
filter("D"."DEPTNO"="E"."DEPTNO")
6 - filter("E"."SAL"<=1500)
On line 6, you can that the optimizer estimated 3 rows but actually got back 7. Large discrepancies indicate areas to investigate.