Search code examples
database-designteradatasizingcapacity-planningphysical-design

teradata data block size and table size calculations


Been using this formula as a std for TB size calculation , as part of Cap. Planning effort . We are on TD 14

     ( rc *   ( rsz / ( blocksize -38) ) * blocksize ) 
      + ( SQL (sel Hashamp()+1 ; ) * 1024 ) 

rsz : row size , rc : count ( * ) 

Here actually

 (blocksize-38)/rsz 

is nothing but rows / block. It comes out a fraction < 1. I think that's bad because it'd mean several blocks span a row. My questions are

  • Do the formulas need any further honing. Latter part after the addition sign , provides for table header . There are NO SI's for this table - just 2 dates , 1 Integer and 1 varchar (50) with a NUPI which is NPPI . None of them are Nullable and obviously without the data , nothing is compressible to begin with ( well there's not enough data info to inc. compression now but we'd run compression scripts later )
  • because it'd be several blocks spanning a row - I should be upping the block size ? how much - what should be the ideal number of rows per block. Table data will get a full refresh every quarter and in there'd be nothing happening in the interim

Solution

  • A row in Teradata never spans blocks.

    You simply got your calculation wrong, you talk about (blocksize-38)/rsz, but the actual calculation shows rsz / ( blocksize -38).

    As the block overhead in newer releases increased to 74 this should be the correct calculation:

     ( rc /  (( blocksize - 74)/rsz ) * blocksize ) 
          + ( (HASHAMP()+1  ) * 1024 ) 
    

    It's found in Sizing Base Tables, Hash Indexes, and Join Indexes

    But you will notice that this approaches rc * rsz for larger tables. As nobody cares about small tables, I usually use this simplified calculation to size a table (you might add 1 or 2 percent to get a maximum possible size).

    Edit:

    Not the calculation is wrong, it's due to base data types used (probably a truncation of a DECIMAL). Change to a FLOAT or NUMBER:

     ( rc *   ( rsz / ( CAST(blocksize -74 AS FLOAT)) ) * blocksize ) 
      + ( (HASHAMP()+1  ) * 1024 )