I'm encountering a situation where Oracle seemingly does not perform an optimization in its execution plan that it should, which causes degraded performance.
I'm working with an Oracle 23 FREE (for testing via docker run -p 1521:1521 -e ORACLE_PASSWORD=oracle gvenzl/oracle-free:23-slim-faststart
), but I encountered the same behaviour in Oracle 19c.
This is the table in question:
CREATE TABLE T_WORKQUEUE
(
ID NUMBER(19) PRIMARY KEY,
DOMAIN_ID VARCHAR2(255) NOT NULL
);
I have a snippet that fills this table with 10 million random rows, where each DOMAIN_ID
is duplicated 10 times, where the results may look something like this (select * from T_WORKQUEUE order by DOMAIN_ID
):
ID;DOMAIN_ID
843038;999809LYJP
843039;999809LYJP
843040;999809LYJP
843031;999809LYJP
843037;999809LYJP
843033;999809LYJP
843034;999809LYJP
843035;999809LYJP
843036;999809LYJP
843032;999809LYJP
8873062;9997OTNKT5
8873063;9997OTNKT5
8873064;9997OTNKT5
...
Then, I add the following index:
create index IDX_WORKQUEUE_DOMAIN_ID on T_WORKQUEUE(DOMAIN_ID ASC);
And make sure the statistics are up-to-date:
begin
dbms_stats.gather_table_stats('SYSTEM', 'T_WORKQUEUE');
dbms_stats.gather_index_stats('SYSTEM', 'IDX_WORKQUEUE_DOMAIN_ID');
end;
/
Now, the slow query in question looks like this:
select max(t.ID) from T_WORKQUEUE t
group by t.DOMAIN_ID
order by t.DOMAIN_ID
fetch first 2 rows only;
It takes about 5 seconds and has the following execution plan, emphasis by me:
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 310 | | 77843 (2)| 00:00:04 |
| 1 | VIEW | | 2 | 310 | | 77843 (2)| 00:00:04 |
| 2 | WINDOW **SORT** PUSHED RANK| | 997K| 16M| 268M| 77843 (2)| 00:00:04 |
| 3 | HASH GROUP BY | | 997K| 16M| 268M| 77843 (2)| 00:00:04 |
| 4 | **TABLE ACCESS FULL** | T_WORKQUEUE | 10M| 162M| | 8246 (1)| 00:00:01 |
------------------------------------------------------------------------------------------------
I expected Oracle to make use of the Index' order and not require any SORT
operations, and instead just read the index with a STOPKEY
. Why can't Oracle optimize this?
Interestingly enough, if I remove the ORDER BY
clause, Oracle does use the expected execution plan, and actually retrieves the results in index order (though I understand this can't be relied upon), emphasis by me:
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 52 | 4 (0)| 00:00:01 |
| 1 | VIEW | | 2 | 52 | 4 (0)| 00:00:01 |
| 2 | WINDOW **NOSORT STOPKEY** | | 2 | 34 | 4 (0)| 00:00:01 |
| 3 | SORT GROUP BY **NOSORT** | | 2 | 34 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| T_WORKQUEUE | 10M| 162M| 4 (0)| 00:00:01 |
| 5 | **INDEX FULL SCAN** | IDX_WORKQUEUE_DOMAIN_ID | 2 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
The problem with the ORDER BY
not using the index is due to some hidden detail, namely the database's default language settings. ORDER BY
follows the database's NLS_SORT
setting, which may be set to something different than BINARY
, e.g. XGERMAN_AI
. This means that the order follows different rules.
The index uses BINARY
sorting, which then becomes a mismatch and forces Oracle to re-sort the entire data in the desired NLS_SORT
setting. This can be fixed by e.g. setting the 'NLS_SORT' to 'BINARY':
ALTER SESSION SET NLS_SORT='BINARY';
select max(t.ID) from T_WORKQUEUE t
group by t.DOMAIN_ID
order by t.DOMAIN_ID
fetch first 2 rows only;
which then has the desired fast execution plan:
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 310 | 4 (0)| 00:00:01 |
| 1 | VIEW | | 2 | 310 | 4 (0)| 00:00:01 |
| 2 | WINDOW NOSORT STOPKEY | | 2 | 34 | 4 (0)| 00:00:01 |
| 3 | SORT GROUP BY NOSORT | | 2 | 34 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| T_WORKQUEUE | 10M| 162M| 4 (0)| 00:00:01 |
| 5 | INDEX FULL SCAN | IDX_WORKQUEUE_DOMAIN_ID | 2 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
Not directly related to this issue, but to keep the behaviour of comparisons < >
consistent with the behaviour of ORDER BY
, it might be desirable to set both NLS_SORT
and NLS_COMP
:
ALTER SESSION SET NLS_COMP='BINARY' NLS_SORT='BINARY';
Otherwise, things like keyset pagination can horribly break if comparisons (e.g. WHERE DOMAIN_ID > X
) behave differently than ORDER BY
(e.g. ORDER BY DOMAIN_ID
).