Search code examples
sqloracleoracle18c

In Oracle 18c, why is max() much less efficient than order by with fetch first 1 rows only?


I have a table with about a billion rows that is partitioned by date and has a (local) index on someNr. This query:

select max("date") from T where someNr = 1;

executes within milliseconds, as expected. However, if there is no entry with someNr = 1, the query takes about a minute. I then found out that:

select "date" from T where someNr = 1 order by "date" desc fetch first 1 rows only;

executes within milliseconds, regardless of whether there is an entry with someNr = 1.

The actual plan in our production DB for max("date"): enter image description here

The actual plan in our production DB for order by: enter image description here

After searching for over an hour, I could not find a satisfying explanation. Either they had no proper index, used rownum < 2 or the answer didn't really answer the question.

The following minimal reproducer is similar to the real world scenario and shows the difference in the explain plan:

SELECT BANNER FROM v$version;

create table T ("DATE" date not null, SOME_NR int not null);
alter table T
  modify PARTITION BY RANGE ("DATE")
    INTERVAL (NUMTODSINTERVAL(1, 'DAY'))
    (partition "old" values less than (to_date('01-JAN-2023', 'DD-MON-YYYY') ));
create index I_T_DATE ON T ("DATE", SOME_NR);

explain plan for
select max("DATE") from T where SOME_NR = 1;

select plan_table_output
from table (dbms_xplan.display('plan_table', null, 'basic'));

explain plan for
select "DATE" from T where SOME_NR = 1 order by "DATE" fetch first 1 rows only;

select plan_table_output
from table (dbms_xplan.display('plan_table', null, 'basic'));

Output:

Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production

+-------------------------------------------------+
|PLAN_TABLE_OUTPUT                                |
+-------------------------------------------------+
|Plan hash value: 1357496781                      |
|                                                 |
|-------------------------------------------------|
|| Id  | Operation                    | Name     ||
|-------------------------------------------------|
||   0 | SELECT STATEMENT             |          ||
||   1 |  SORT AGGREGATE              |          ||
||   2 |   FIRST ROW                  |          ||
||   3 |    PARTITION RANGE ALL       |          ||
||   4 |     INDEX FULL SCAN (MIN/MAX)| I_T_DATE ||
+-------------------------------------------------+


+-------------------------------------------+
|PLAN_TABLE_OUTPUT                          |
+-------------------------------------------+
|Plan hash value: 3984456263                |
|                                           |
|-------------------------------------------|
|| Id  | Operation              | Name     ||
|-------------------------------------------|
||   0 | SELECT STATEMENT       |          ||
||   1 |  VIEW                  |          ||
||   2 |   WINDOW NOSORT STOPKEY|          ||
||   3 |    PARTITION RANGE ALL |          ||
||   4 |     INDEX FULL SCAN    | I_T_DATE ||
+-------------------------------------------+

So why, as it seems, is Oracle's max() not smart enough to just sort by date descending and use the first value in the same way that order by + fetch first 1 rows only? I understand that max() may return null whereas the second option doesn't, but this really shouldn't matter at all.


Solution

  • You're right that Oracle's MIN/MAX index algorithm is not very smart. It appears to only be capable of traversing an index one time to get either the largest or the smallest value. And Oracle does not often transform queries to use MIN/MAX, so you'll need to rewrite your query to take advantage of that algorithm.

    I can't truly answer why the algorithm isn't smart. I would guess it's a vicious circle, where Oracle didn't invest much effort into making MIN/MAX smart initially, so it's not used often, therefore it's not worth investing more effort to improve. The MIN/MAX option is not even listed in the otherwise-comprehensive PLAN_TABLE reference in the documentation.

    As an example of how dumb MIN/MAX can be, a query can only use either MIN or MAX, not both. A query like SELECT MIN(A),MAX(B) FROM SOME_TABLE needs to be re-written as SELECT (SELECT MIN(A) FROM SOME_TABLE), (SELECT MAX(B) FROM SOME_TALBE) FROM DUAL to take advantage of MIN/MAX. See my answer here for more details about this limitation.

    But understanding the algorithm's limitations can help us craft a query that might work better. We can use MIN/MAX once to quickly get the maximum value, and then plug that value back into the query. The below plan generates one "INDEX FULL SCAN (MIN/MAX)" and one "INDEX RANGE SCAN", but reading the index twice with those operations is likely faster than your query which performs a regular "INDEX FULL SCAN".

    explain plan for
    select "DATE"
    from T
    where SOME_NR = 1
      --This subquery improves performance by enabling the (MIN/MAX) option.
      and "DATE" = (select max("DATE") from T where SOME_NR = 1)
    order by "DATE" fetch first 1 rows only;
    
    select plan_table_output
    from table (dbms_xplan.display('plan_table', null, 'basic'));
    
    
    Plan hash value: 206361108
     
    ---------------------------------------------------
    | Id  | Operation                      | Name     |
    ---------------------------------------------------
    |   0 | SELECT STATEMENT               |          |
    |   1 |  VIEW                          |          |
    |   2 |   WINDOW NOSORT STOPKEY        |          |
    |   3 |    INDEX RANGE SCAN            | I_T_DATE |
    |   4 |     SORT AGGREGATE             |          |
    |   5 |      FIRST ROW                 |          |
    |   6 |       INDEX FULL SCAN (MIN/MAX)| I_T_DATE |
    ---------------------------------------------------