
Oracle - is a daily partitioning approach good for a table with millions of rows?


I am working on an existing table that holds millions of rows of retail transaction history. The schema looks like this:

create table History(
hid number(19,0),
type varchar2(255 char),
lastupdated timestamp (6) not null enable,
name varchar2(255 char),
primary key (hid))
partition by range (lastupdated) interval (numtodsinterval(1,'day'))
(partition retailhistory values less than (to_timestamp('12/01/2020','DD/MM/YYYY')));

The above table uses daily interval partitioning, so there will be 365 partitions per year, containing millions of rows. We are planning to create a purging job, but until then, will it be OK to have this many partitions in a table, and will it cause performance degradation compared to a table with no partitions?

Any help would be appreciated. Thanks. :)


Solution

  • Depending on your definition of "millions", daily partitioning may be too granular and may cause performance problems when reading data.

    Each partition is physically stored like a table and carries its own overhead - most importantly the overhead of segment space allocation. Oracle almost never allocates exactly as much space as is needed; there's always a little extra. If you create a large number of tiny partitions, that "little extra" can be larger than the actual data (the per-partition breakdown after the test code below shows this).

    In the test case below, assuming 5 million rows per year and a relatively small value in each column, daily partitioning uses ten times more segment space than monthly partitioning. That means daily partitioning will be optimal for selecting a single day of data, but terrible for selecting multiple days. And since full table/partition scans use multi-block reads and read megabytes of data at a time anyway, reading a whole month of data may not be much slower than reading a single day.

    Sample Schema

    Create three tables: one with daily partitioning, one with monthly partitioning, and one with no partitioning. Load 5 million rows into each, spread evenly across 365 days, and then check the segment sizes.

    • Daily partitioning uses 2,920 megabytes for 365 segments.
    • Monthly partitioning uses 288 megabytes for 13 segments.
    • No partitioning uses 280 megabytes for 1 segment.

    Code:

    ----------------------------------------
    --DAY: 2,920 megabytes for 365 partitions.
    ----------------------------------------
    
    create table History_day(
    hid number(19,0),
    type varchar2(255 char),
    lastupdated timestamp (6) not null enable,
    name varchar2(255 char),
    primary key (hid))
    partition by range (lastupdated) interval (numtodsinterval(1,'day'))
    (partition retailhistory values less than (to_timestamp('12/01/2020','DD/MM/YYYY')));
    
    create sequence history_day_seq;
    
    begin
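        -- 365 days x 13,698 rows per day = 4,999,770 rows (~5 million)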
        for i in 1 .. 365 loop
            for j in 1 .. 13698 loop
                insert into history_day values(history_day_seq.nextval, 'some type value', date '2020-12-01' + i, 'some name value');
            end loop;
        end loop;
        commit;
    end;
    /
    
    select sum(bytes)/1024/1024 mb, count(*) partition_count from dba_segments where segment_name = 'HISTORY_DAY';
    
    
    ----------------------------------------
    --MONTH: 288 megabytes for 13 partitions.
    ----------------------------------------
    
    create table History_month(
    hid number(19,0),
    type varchar2(255 char),
    lastupdated timestamp (6) not null enable,
    name varchar2(255 char),
    primary key (hid))
    partition by range (lastupdated) interval (numtoyminterval(1,'month'))
    (partition retailhistory values less than (to_timestamp('12/01/2020','DD/MM/YYYY')));
    
    create sequence history_month_seq;
    
    begin
        for i in 1 .. 365 loop
            for j in 1 .. 13698 loop
                insert into history_month values(history_month_seq.nextval, 'some type value', date '2020-12-01' + i, 'some name value');
            end loop;
        end loop;
        commit;
    end;
    /
    
    select sum(bytes)/1024/1024 mb, count(*) partition_count from dba_segments where segment_name = 'HISTORY_MONTH';
    
    
    ----------------------------------------
    --NO PARTITIONS: 280 megabytes for 1 segment.
    ----------------------------------------
    
    create table History(
    hid number(19,0),
    type varchar2(255 char),
    lastupdated timestamp (6) not null enable,
    name varchar2(255 char),
    primary key (hid));
    
    create sequence history_seq;
    
    begin
        for i in 1 .. 365 loop
            for j in 1 .. 13698 loop
                insert into history values(history_seq.nextval, 'some type value', date '2020-12-01' + i, 'some name value');
            end loop;
        end loop;
        commit;
    end;
    /
    
    select sum(bytes)/1024/1024 mb, count(*) partition_count from dba_segments where segment_name = 'HISTORY';
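
    To see where the daily-partitioned table's space goes, break the figures down per partition. The arithmetic above already hints at the answer: 2,920 megabytes across 365 segments is exactly 8 megabytes per partition, no matter how little data each day holds - on many systems that matches the large initial extent Oracle allocates for partition segments by default, but your defaults may differ. The query below is a sketch, assuming the DBA_* views are accessible (USER_SEGMENTS and USER_TAB_PARTITIONS work similarly for your own schema) and that statistics have been gathered:

    begin
        dbms_stats.gather_table_stats(user, 'HISTORY_DAY');
    end;
    /

    --Compare the space allocated to each daily partition with the size of the rows it actually holds.
    select s.partition_name,
           round(s.bytes/1024/1024, 1) allocated_mb,
           round(p.num_rows * p.avg_row_len/1024/1024, 1) row_data_mb
    from dba_segments s
    join dba_tab_partitions p
        on  p.table_name = s.segment_name
        and p.partition_name = s.partition_name
    where s.segment_name = 'HISTORY_DAY'
    order by s.partition_name;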
    

    Do these results apply to you?

    There's a good chance your results will be different, depending on your data and on how your system allocates segments. The important thing is to run a test like this yourself. And keep in mind that partitioning is designed for "large" amounts of data - but the word "large" is clearly subjective.
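
    A quick way to see how your system allocates segments is to check the tablespace settings. This is a minimal sketch, assuming access to DBA_TABLESPACES; the tablespace name is a placeholder to replace with the one holding your partitions:

    select tablespace_name,
           allocation_type,            --SYSTEM (autoallocate) or UNIFORM
           initial_extent,             --smallest initial extent, in bytes
           segment_space_management    --AUTO or MANUAL
    from dba_tablespaces
    where tablespace_name = 'USERS';   --placeholder tablespace name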

    Daily partitioning is perfect for many large tables, but I have a feeling that your table would work better with monthly partitioning.
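
    If you do switch, the interval of an existing interval-partitioned table can be changed for future data with ALTER TABLE ... SET INTERVAL. Note that this only affects partitions created from then on; the existing daily partitions remain as they are and would have to be merged, or the table rebuilt (for example with DBMS_REDEFINITION), to consolidate the old history. A sketch:

    --Future data will go into one partition per month; existing daily partitions are unchanged.
    alter table History set interval (numtoyminterval(1,'month'));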