Search code examples
sqloracle-databaseoracle11gpartitioningdatabase-partitioning

Hash partitioning in oracle SQL


I have a table like this one:

CREATE TABLE "TS1" 
       (    
        "ID" VARCHAR2(32 BYTE) NOT NULL, 
        "CID" VARCHAR2(70 BYTE) NOT NULL, 
        "PID" VARCHAR2(21 BYTE) NOT NULL, 
        "LASTUSAGE" TIMESTAMP (6) NOT NULL, 
        "CREATIONTIME" TIMESTAMP (6) NOT NULL, 
        "COSTCENTER" NUMBER NOT NULL
       );

ALTER TABLE "TS1" ADD CONSTRAINT "TS1_PRIMARY" PRIMARY KEY ("ID", "CID", "PID");

I tried to find a good way to partition the table considering:

  • I have no query that use creationTime in where clause ( So range partition maybe is not the best solution on this field)
  • LastUsage is updated very often (so range partition maybe is not the best solution on this field)
  • Most of the queries uses ID, CID, PID in where clause

So standing this, a good option should be HASH PARTITION on ID,CID,PID.

CREATE TABLE "TS1" 
       (    
        "ID" VARCHAR2(32 BYTE) NOT NULL, 
        "CID" VARCHAR2(70 BYTE) NOT NULL, 
        "PID" VARCHAR2(21 BYTE) NOT NULL, 
        "LASTUSAGE" TIMESTAMP (6) NOT NULL, 
        "CREATIONTIME" TIMESTAMP (6) NOT NULL, 
        "COSTCENTER" NUMBER NOT NULL
       )       
PARTITION BY HASH ("ID", CID, PID)
PARTITIONS N;  --N = number of partitions


ALTER TABLE "TS1" ADD CONSTRAINT "TS1_PRIMARY" PRIMARY KEY ("ID", "CID", "PID");

Is it a problem if I'm partitioning by hash using the primary key as parameter? Let's suppose to have lot of records in table TS1 (millions) I will receive some perfomance benefits from this partitioning?


Solution

  • "Most of the queries uses ID, CID, PID in where clause"

    This means most queries are single row lookups on the primary key, so there is no way partition elimination can make things faster. All it might do is make those few queries which don't use the key slower (because say reads using an index range scan might not be as performative).

    There are three reasons to implement Partitioning. They are:

    • data management. We can load data into a single partition using partition exchange, or zap data using drop or truncate partition with no impact on the rest of the table.
    • availability. We can have a separate tablespace for each partition which localises the impact of datafile corruption or similar.
    • performance. Queries which work with the grain of the partitioning key may benefit from partition pruning. Queries which might benefit are those which will execute a range scan; if we load a million rows into a table each day and we generally want to retrieve records for a given day we would get a lot of benefit from partitioning by day.
    • concurrent DML. If our application has a large number of users inserting, changing and deleting records we may have e.g. waits for ITL slots or latch contention, some times known. hot blocks. Hash partitioning can help here, by distributing inserts and hence all other activity across the whole table.

    Partitioning by a hash of ("ID", CID, PID) won't help you with performance, if the usage profile is as you describe. Nor will it give you any data management advantage. It seems unlikely you're interested in the availability benefits (because millions of rows seems too small a number to worry about).

    So that leaves concurrent DML. If the performance problem you are trying to solve is writing rather than reading and the pattern of concurrent activity aligns with some aspect of the primary key (say most DML is for the newest rows) then perhaps hash partitioning will alleviate the latch contention. If that sounds like your situation you should test Partitioning in an environment with Production-like volumes of data and Production levels of activity. (Not always easy to do.)

    Otherwise Partitioning seems like a solution in search of a problem.