Search code examples
sqloracle-databaseoracle11ggroup-byquery-performance

Strange behavior of Oracle and group by


While analyzing performance of an SQL query in Oracle, I noticed a strange behavior. I noticed that Oracle's plan behavior changes depending on value used in query.

For example here is my table structure:


  CREATE TABLE "USAGE" 
   (    "ID" NUMBER(11,0) NOT NULL ENABLE, 
    "CREATED_DATE" TIMESTAMP (6), 
    "MODIFIED_DATE" TIMESTAMP (6),
    "PERIOD" TIMESTAMP (6) NOT NULL ENABLE, 
    "DOWNLOAD" NUMBER(19,0),
     PRIMARY KEY ("ID")
   );

  CREATE INDEX "USAGE_A0ACFA46" ON "USAGE" ("PERIOD");


  CREATE UNIQUE INDEX "USAG_PERIOD_772992E2_UNIQ" ON "USAGE" ("PERIOD");

When I fetched plan of following query, I see that table is accessed by INDEX RANGE SCAN, which is expected:


explain plan for
select usg.period, sum(usg.download)
 from usage usg
 where usg.period>=TIMESTAMP '2018-11-30 00:00:00'
    group by usg.period;

SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
----------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                           |     1 |    18 |     3   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT        |                           |     1 |    18 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| USAGE                     |     1 |    18 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | USAG_PERIOD_E67F63D3_UNIQ |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

However, when I change the value only, I noticed that table is accessed by TABLE ACCESS FULL, which is very strange for me:


select usg.period, sum(usg.download)
 from usage usg
 where usg.period>=TIMESTAMP '2017-11-30 00:00:00'
    group by usg.period;

SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |   133 |  2394 |    69   (2)| 00:00:01 |
|   1 |  HASH GROUP BY     |       |   133 |  2394 |    69   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| USAGE |  9505 |   167K|    68   (0)| 00:00:01 |
----------------------------------------------------------------------------

My question is, why does it happen? I would expect Oracle to use INDEX RANGE SCAN, no matter what the value is.

My database is Oracle 11g


Solution

  • the optimizer may decide whether to use or not to use an individual index depending on the amount of the data, for huge set of data the full-scan is preferred rather than the index range scan.

    Your second case seems scanning a bigger data set as being the interval is longer.

    As an example, try to restrict your scans for only one-month period

    Q1 :

    select usg.period, sum(usg.download)
      from usage usg
     where usg.period between timestamp'2017-11-01 00:00:00' and timestamp'2017-11-30 00:00:00'
     group by usg.period;
    

    and

    Q2 :

    select usg.period, sum(usg.download)
     from usage usg
     where usg.period between timestamp'2018-11-01 00:00:00' and timestamp'2018-11-30 00:00:00'
        group by usg.period;
    

    for both queries Q1 and Q2, you most probably can see a index range scan with close values of costs depending on the homogeneously populated data for the table. The indexes are mostly good for small number of rows.