Search code examples
oracle-databaseindexingreversedatabase-partitioning

Reverse key indexes with multiple columns strategy


I'm creating a local index on partitioned table the Table is partitioned X (List partition)

The table TEST looks like this:

X Y Z
AB 24 63
AB 24 65
CD 24 63
CD 24 65
EF 24 63

If I create an index

CREATE UNIQUE INDEX TEST_IDX ON TEST (X,Y,Z) LOCAL;

I "effectively" store the following

AB2463

AB2465

CD2463

CD2465

EF2463

If I create a reverse key index however, how is the data stored when using multiple columns?

CREATE UNIQUE INDEX TEST_IDX ON TEST (X,Y,Z) REVERSE LOCAL;

Is it stored as?

3643BA

5643BA

3643DC

5643DC

3643FE

OR is is stored as: (I'm hoping it's this one)

BA4236

BA4256

DC4236

DC4256

FE4236

In reality, this table has a few more columns and 100's of millions rows:

X has 50 distinct values

Y has 150 distinct values

Z has over 240000000 distinct values

I dropped and created the the index reversing the keys and its seems like there has been an improvement but I'm concerned about the partitioning and scaling this out

We have a few thousand updates that happen against this table every 30 mins or so and they are slow (0.20 per exec) and they're using that unique column

IE: (Note COL A,B,C are not shown above) update TEST set A = :1 , B = :2 , C = :3 where X = :4 AND Y = :5 AND Z =:6


Solution

  • Reverse indexes reverse the byte order for each column value. They do not reverse the order of columns in the index. So a reverse index would store:

    BA 42 36

    BA 42 56

    DC 42 36

    DC 42 56

    FE 42 36

    (in Oracle's encoding format of course. It doesn't store numbers as decimals so the reverse of 24 isn't 42, but we're just using decimal to make a point here)

    You can demonstrate this with block dumps. Dump of normal index:

    row#0[8014] flag: -------, lock: 2, len=18
    col 0; len 2; (2):  41 42  -- encodes A, B
    col 1; len 2; (2):  c1 19  -- encodes the number 24
    col 2; len 2; (2):  c1 40  -- encodes the number 63
    col 3; len 6; (6):  00 75 81 d9 00 00
    row#1[7996] flag: -------, lock: 2, len=18
    col 0; len 2; (2):  41 42  -- encodes A, B
    col 1; len 2; (2):  c1 19
    col 2; len 2; (2):  c1 42
    col 3; len 6; (6):  00 75 81 d9 00 01
    row#2[7978] flag: -------, lock: 2, len=18
    col 0; len 2; (2):  43 44  -- encodes C, D
    col 1; len 2; (2):  c1 19
    col 2; len 2; (2):  c1 40
    col 3; len 6; (6):  00 75 81 d9 00 02
    row#3[7960] flag: -------, lock: 2, len=18
    col 0; len 2; (2):  43 44  -- encodes C, D
    col 1; len 2; (2):  c1 19
    col 2; len 2; (2):  c1 42
    col 3; len 6; (6):  00 75 81 d9 00 03
    row#4[7942] flag: -------, lock: 2, len=18
    col 0; len 2; (2):  45 46  -- encodes E, F
    col 1; len 2; (2):  c1 19
    col 2; len 2; (2):  c1 40
    col 3; len 6; (6):  00 75 81 d9 00 04
    

    Dump of reverse index:

    row#0[8014] flag: -------, lock: 0, len=18
    col 0; len 2; (2):  42 41  -- encodes B, A
    col 1; len 2; (2):  19 c1  -- encodes the number 24 *backwards*
    col 2; len 2; (2):  40 c1  -- encodes the number 63 *backwards*
    col 3; len 6; (6):  00 75 81 d9 00 00
    row#1[7996] flag: -------, lock: 0, len=18
    col 0; len 2; (2):  42 41  -- encodes B, A
    col 1; len 2; (2):  19 c1
    col 2; len 2; (2):  42 c1
    col 3; len 6; (6):  00 75 81 d9 00 01
    row#2[7978] flag: -------, lock: 0, len=18
    col 0; len 2; (2):  44 43  -- encodes D, C
    col 1; len 2; (2):  19 c1
    col 2; len 2; (2):  40 c1
    col 3; len 6; (6):  00 75 81 d9 00 02
    row#3[7960] flag: -------, lock: 0, len=18
    col 0; len 2; (2):  44 43  -- encodes D, C
    col 1; len 2; (2):  19 c1
    col 2; len 2; (2):  42 c1
    col 3; len 6; (6):  00 75 81 d9 00 03
    row#4[7942] flag: -------, lock: 0, len=18
    col 0; len 2; (2):  46 45  -- encodes F, E
    col 1; len 2; (2):  19 c1
    col 2; len 2; (2):  40 c1
    col 3; len 6; (6):  00 75 81 d9 00 04
    

    Note that the byte order has been reversed for both strings and numbers. However Oracle would store the datatype, it simply reverses the byte order. The only exception is the ROWID itself (col 3 in the dump) isn't reversed, obviously. The order of columns relative to each other hasn't changed.

    The purpose of reverse indexes is to spread inserts out across the leaf nodes so there isn't a hot right-leading edge when you have every-increasing column values inserted in order. This can help reduce contention if you're experiencing it. There would be no benefit to reversing the column order (you can always reorder the columns yourself if you wished). The downside to reverse indexes is, because the bytes are stored in reverse order, they can only be used for equality predicates, not inequalities like >, <, BETWEEN, LIKE (it can do a full scan, but not range scan/binary search on the starting value). That makes it appropriate for identifiers that have no sequential meaning (like customer IDs), but not for things like dates, numbers reflecting quantities, strings that may be pattern matched, etc. The same benefit can be accomplished by hash partitioning the table on a unique column and making the index local. That too will spread out inserts across many index blocks while still retaining the ability to do inequality searches. Neither approach, however, should be used unless hot-block contention is actually a problem.