Search code examples
databaseoracle-databaseindexingdatabase-partitioning

Best combination of indexes and partitions for large Oracle tables with many lookups?


I have an Oracle 10g database of genomic data with several >100 million row tables that look similar to the following:

ID    AssayID    Sample      Mutation    Call Frequency
101   12578      Sample01    T367G       P    0.87
102   31384      Sample01    A2345C      A    0.28
103   3453       Sample01    T247C       P    0.67
104   12578      Sample02    G235del     M    0.11
105   7868       Sample02    None        P    0.98
  • ID is a unique PK, AssayID and Sample are foreign keys.
  • Assume that for each Sample value, there are ~50k rows.
  • Each AssayID occurs exactly once per Sample.
  • Mutation is relatively random and Call can be one of three values.
  • Queries on this table can use any one or a combination of the AssayID, Sample, Mutation, Call, or a value in a linked table via AssayID and Sample.

A typical query:

select t.*
from this_table t
    join assay_table a on t.assayid = a.assayid
    join sample_table s on t.sample = s.sample
where 
    s.name = 'xxx' and a.gene in ('abc', 'xyz') and t.call = 'P'
  • Queries against these tables always join multiple smaller tables.
  • The WHERE statement will usually filter data on multiple columns, but never from only the base data table.

How do I design the table to get the best query performance when selecting all columns?
Do I use indexes only, partitions only, or a combination of the two? Disk space and insert/update performance is not an issue.


Solution

  • After creating numerous test copies of tables with different combinations of indexes and partitions, and running a battery of performance analyses using a broad selection of queries, I don't think that there is a single, simple answer for this question. Each situation is different, and the scope of this question is too broad for this forum. Thank you everyone for the feedback, it was all helpful.