Edit:
Database - Oracle 11gR2 - Over Exadata (X2)
Am framing a problem investigation report for past issues, and I'm slightly confused with a situation below.
Say , I have a table MYACCT
. There are 138
Columns. It holds 10 Million
records. With regular updates of at least 1000 records(Inserts/Updates/Deletes) per hour.
Primary Key is COL1 (VARCHAR2(18))
(Application rarely uses this, except for a join with other tables)
There's Another Unique Index over COL2 VARCHAR2(9))
. Which is the one App regularly uses. What ever updates I meant previously happens based on both these columns. Whereas any SELECT
only operation over this table, always refer COL2
. So COL2
would be our interest.
We do a Query below,
SELECT COUNT(COL2) FROM MYACCT; /* Use the Unique Column (Not PK) */
There's no issue with the result, whereas I was the one recommending to change it as
SELECT COUNT(COL1) FROM MYACCT; /* Use the primary Index
I just calculated the time taken for actual execution
Query using the PRIMARY KEY was faster by `0.8-1.0 seconds always!
Now, I am trying to explain this behavior. Just drafting the Explain plan behind these Queries.
SELECT COUNT(COL1) FROM MYACCT;
Plan :
SQL> select * from TABLE(dbms_xplan.display);
Plan hash value: 2417095184
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11337 (1)| 00:02:17 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX STORAGE FAST FULL SCAN| PK_ACCT | 10M| 11337 (1)| 00:02:17 |
---------------------------------------------------------------------------------
9 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
41332 consistent gets
0 physical reads
0 redo size
210 bytes sent via SQL*Net to client
346 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SELECT COUNT(COL2) FROM MYACCT;
Plan :
Plan hash value: 1130703739
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 10 | 7868 (1)| 00:01:35 |
| 1 | SORT AGGREGATE | | 1 | 10 | | |
| 2 | INDEX STORAGE FAST FULL SCAN| MYINDX01 | 10M| 95M| 7868 (1)| 00:01:35 |
------------------------------------------------------------------------------------------
9 rows selected.
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
28151 consistent gets
23 physical reads
784 redo size
233 bytes sent via SQL*Net to client
346 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
We can find in terms of
Cost
andTime
the Query without primary key wins. Then how come the execution time of primary key is faster???
EDIT:
SQL> select segment_name, bytes from dba_segments where segment_name in ('MYINDX01','PK_ACCT');
PK_ACCT 343932928
MYINDX01 234881024
You're reading more data from the PK index than you are from the other one. COL1
is VARCHAR2(18)
while COL2
is VARCHAR(9)
, which doesn't necessarily mean anything but implies you probably have values in COL1
that are consistently longer than those in COL2
. They will therefore use more storage, both in the table and in the index, and the index scan has to pull more data from the block buffer and/or disk for the PK-based query.
The execution statistics show that; 41332 consistent gets for the PK-based query, and only 28151 for the faster one, so it's doing more work with the PK. And the segment sizes show it too - for the PK you need to read about 328M, for the UK only 224M.
The block buffer is likely to be crucial if you're seeing the PK version run faster sometimes. In the example you've shown both queries are hitting the block buffer - the 23 physical reads are a trivial number, If the index data wasn't cached consistently then you might see 41k consistent gets versus 28k physical reads, which would likely reverse the apparent winner as physical reads from disk will be slower. This often manifests if running two queries back to back shows one faster, but reversing the order they run shows the other as faster.
You can't generalise this to 'PK query is slower then UK query'; this is because of your specific data. You'd probably also get better performance if your PK was actually a number column, rather than a VARCHAR2
column holding numbers, which is never a good idea.