I'm joining a table with itself but although I expect this operation to use index, it seems it doesn't. There are 1 million records on the table(MY_TABLE
) and the query I run is executing on about 10 thousand records.(So it is lower than %1 of whole table.)
Test Case:
explain plan for
SELECT *
FROM SCHM.MY_TABLE A1, SCHM.MY_TABLE A2
WHERE (A1.K_ID = '123abc')
AND A1.HDT = A2.HDT
AND A2.C_DATE BETWEEN A1.SYSDATE - 0.0004
AND A1.SYSDATE + 0.0004
AND A1.GKID = A2.GKID;
Plan hash value: 1210306805
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 81 | 28 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | HASH JOIN | | 3 | 81 | 28 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| MY_TABLE | 3 | 45 | 7 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IX_MY_TABLE_C_DATE | 3 | | 4 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID BATCHED| MY_TABLE | 17 | 204 | 21 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IX_MY_TABLE_K_ID | 17 | | 4 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SYSDATE@!+0.00004>=SYSDATE@!-0.00004)
2 - access("A1"."HDT"="A2"."HDT" AND
"A1"."GKID"="A2"."GKID")
4 - access("A2"."C_DATE">=SYSDATE@!-0.00004 AND
"A2"."C_DATE"<=SYSDATE@!+0.00004)
6 - access("A1"."K_ID"=U'123abc')
In the above statement, it can be seen that the index on C_DATE is used.
However, in the below statement, the index on C_DATE is not used. So, the query runs really slow.
Real Case:
explain plan for
SELECT *
FROM SCHM.MY_TABLE A1, SCHM.MY_TABLE A2
WHERE (A1.K_ID = '123abc')
AND A1.HDT = A2.HDT
AND A2.C_DATE BETWEEN A1.C_DATE - 0.0004
AND A1.C_DATE + 0.0004
AND A1.GKID = A2.GKID;
Plan hash value: 1063167343
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4187K| 998M| 6549K (1)| 00:04:16 |
|* 1 | HASH JOIN | | 4187K| 998M| 6549K (1)| 00:04:16 |
| 2 | JOIN FILTER CREATE | :BF0000 | 17 | 2125 | 21 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| MY_TABLE | 17 | 2125 | 21 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IX_MY_TABLE_K_ID | 17 | | 4 (0)| 00:00:01 |
| 5 | JOIN FILTER USE | :BF0000 | 1429M| 166G| 6546K (1)| 00:04:16 |
|* 6 | TABLE ACCESS STORAGE FULL | MY_TABLE | 1429M| 166G| 6546K (1)| 00:04:16 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access(A1.HDT=A2.HDT AND
A1.GKID=A2.GKID)
filter(A2.C_DATE>=INTERNAL_FUNCTION(A1.C_DATE)-0.00004 AND
A2.C_DATE<=INTERNAL_FUNCTION(A1.C_DATE)+0.00004)
4 - access(A1.K_ID=U'123abc')
6 - storage(SYS_OP_BLOOM_FILTER(:BF0000,A2.HDT,A2.HDT))
filter(SYS_OP_BLOOM_FILTER(:BF0000,A2.HDT,A2.HDT))
If I use the hint /*+index(A2,IX_MY_TABLE_C_DATE )*/
, everthing is fine, the index is used and the query runs fast as I want.
The query in the real case cannot be changed because it is created by application.
Index Information:
K_ID, not unique, position 1
HDT, not unique, position 1
C_DATE, not unique, position 1
ID Unique and Primary Key, position 1
What do I have to change in order the query in the real case to use index?
Well, the second query is slower since it's quite different from the first one. It has an extra join between tables:
AND A2.C_DATE BETWEEN A1.C_DATE - 0.0004
AND A1.C_DATE + 0.0004
and on a million rows this takes a toll.
The first query doesn't have this join condition and both tables are:
The second query needs to perform:
This is way slower.
I suggest adding the following indexes, and try again:
create index ix_1 (k_id);
create index ix_2 (hdt, gkid, c_date);