Search code examples
sqloracle-databaseoracle11gpartitioningdatabase-partitioning

Query the exact partition when table is hash-partitioned


When I want to query a single partition I usually use something like that:

Select * from t (partition p1)

But when you have to query it in your pl/sql code it comes to using execute immediate and hard-parse of the statement.

Okay, for RANGE-partitioned table (let it be SOME_DATE of date type) I can workaround it like

Select * from t where some_date <= :1 and some_date > :2

Assuming :1 and :2 stand for partition bonds.

Well, as for LIST-partitioned table I can easily specify the exact value of my partition key field like

Select * from t where part_key = 'X'

And what about HASH-partitioning? For example, I have a table partitioned by hash(id) in 16 partitions. And I have 16 jobs each handling its own partition. So I have to use it like that

Select * from t (partition p<n>)

Question is: can I do it like this for example

Select * from t where hash(id) = :1

To enforce partition pruning take the whole n-th partition?

It's okay when you have just 16 partitions but in my case I have composite partitioning (date + hash(id)), so every time job handles a partition it's always a new sql_id and it ends up in quick shared pool growth


Solution

  • It appears Oracle internally uses the ora_hash function (at least since 10g) to assign a value to a partition. So you could use that to read all the data from a single partition. Unfortunately, though, since you'd be running a query like

    select *
      from t
     where ora_hash( id, 9 ) = 6
    

    to get all the data in the 6th of 8 hash partitions, I'd expect Oracle to have to read every partition in the table (and compute the hash on every id) because the optimizer isn't going to be smart enough to recognize that your expression happens to map exactly to its internal partitioning strategy. So I don't think you'd want to do this to split data up to be processed by different threads.

    Depending on what those threads are doing, would it be possible to use Oracle's built-in parallelism instead (potentially incorporating things like parallelizable pipelined table functions if you're doing ETL processing). If you tell Oracle to use 16 parallel threads and your table has 16 partitions, Oracle will internally almost certainly do the right thing.