Search code examples
oraclequery-performancebind-variables

Oracle optional bind variables


I have a query to select users from a table, given user id. This parameter is optional.

This is the query:

SELECT * FROM USERS
WHERE (USER_ID = :USER_ID OR :USER_ID IS NULL)
ORDER BY USER_ID;

Now I execute the query finding one user, so :USER_ID takes the valor 1 :

SELECT * FROM USERS
WHERE (USER_ID = 1 OR 1 IS NULL)
ORDER BY USER_ID;

This query takes 5 seconds.

And then, I add to the previous query OR :USER_ID IS NULL many times. This example takes much more time than the first:

SELECT * FROM USERS
WHERE (USER_ID = 1 OR 1 IS NULL [OR 1 IS NULL]x100)
ORDER BY USER_ID;

This query takes 30 seconds.


The execution plan are the same in the two examples:

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |  3256K|   695M|       |   682K  (1)| 00:00:27 |       |       |
|   1 |  SORT ORDER BY              |         |  3256K|   695M|   877M|   682K  (1)| 00:00:27 |       |       |
|   2 |   PARTITION RANGE ALL       |         |  3256K|   695M|       |   534K  (1)| 00:00:21 |     1 |1048575|
|*  3 |    TABLE ACCESS STORAGE FULL| USERS |  3256K|   695M|       |   534K  (1)| 00:00:21 |     1 |1048575|

Version of Oracle: Oracle Database 12c


Why oracle does not take the first statement, that it's always true, and stop evaluating the rest?


Solution

  • Your problem is the FULL TABLE SCAN on a large table triggered by the ORpredicate.

    Based on the value of the bind variable the query returns either one row (if the bind variable is not NULL) or the whole table otherwise.

    For only one bind variable you may use the NVL trick

    SELECT * FROM USERS
    WHERE (USER_ID = nvl(:USER_ID, USER_ID))
    ORDER BY USER_ID;
    

    which leads to a execution plan consisting of two parts covering both cases :

    BV is NULL -> FULL SCAN

    BV is NOT NULL -> INDEX ACCES

    --------------------------------------------------------------------------------------------
    | Id  | Operation                      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT               |           |  8329 |  9313K|   941   (1)| 00:00:12 |
    |   1 |  SORT ORDER BY                 |           |  8329 |  9313K|   941   (1)| 00:00:12 |
    |   2 |   CONCATENATION                |           |       |       |            |          |
    |*  3 |    FILTER                      |           |       |       |            |          |
    |*  4 |     TABLE ACCESS FULL          | USERS     |  8247 |  9221K|   925   (1)| 00:00:12 |
    |*  5 |    FILTER                      |           |       |       |            |          |
    |   6 |     TABLE ACCESS BY INDEX ROWID| USERS     |    82 | 93890 |    15   (0)| 00:00:01 |
    |*  7 |      INDEX RANGE SCAN          | USERS_IDX |  1110 |       |     1   (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - filter(:USER_ID IS NULL)
       4 - filter("USER_ID" IS NOT NULL)
       5 - filter(:USER_ID IS NOT NULL)
       7 - access("USER_ID"=:USER_ID)
    

    So this will response quickly, if the BV is passed (not NULL) AND index on USER_IDis defined. This will lead to a FULL TABLE SCAN (5 seconds) AND SORT of the whole table (my guess an other 25 seconds), giving total 30 seconds response.

    Note that if you pass the BV, you performs only the FULL TABLE SCAN, the SORT time is neglectible as only one records is returned (assuming USER_ID is PK) - which explains the difference in the response time.