Search code examples
sqldatabasedatabase-designteradatatable-statistics

teradata collect stats using sample doesnt quite work


trying to collect sample 2 pct multi column stats on a VT Problem is , it does not sample or I am missing something

COLLECT STATISTICS USING sample 2 percent 
        COLUMN ( C1,C2,C3) , 
        COLUMN ( C1 ) on vt  , 

Volatile table VT has a few Billion rows and it takes just as much "forever" time to collect stats on VT as would be a regular stats collection.
Also elsewhere , in a similar situation in past , I observed if I use the 'new' ( many cols in one ) stats collection syntax , it'd sample the 1st col and ignore the rest ( now this 2nd part isn't verified all the way since I based it more out of a 'time taken' hunch as against peeping into the histo. )

    show  stats values  
            COLUMN ( C1,C2,C3 ) , 
            COLUMN ( C1 )  on vt;

COLLECT STATISTICS 
            COLUMN ( C1,C2,C3 )
                ON  vt 
            VALUES
(
 /** SummaryInfo **/ 
 /* Data Type and Length: 'I8:8', 'I:4', 'CF:1' */
 /* TimeStamp             */ TIMESTAMP '2016-03-29 16:16:13-00:00', 
 /* Version               */ 6, 
 /* OriginalVersion       */ 6, 
 /* DBSVersion            */ '14.00.00', 
 /* UsageType             */ 'D', 
 /* ComplexStatInfo       */ 'ComplexStatInfo', 
 /* NumOfBiasedValues     */ 0, 
 /* NumOfEHIntervals      */ 200, 
 /* NumOfHistoryRecords   */ 1, 
 /* SamplePercent         */ 0.00, 
 /* NumOfNulls            */ 0, 
 /* NumOfAllNulls         */ 0, 
 /* NumOfPartialNullVals  */ 0, 
 /* PartialNullHMF        */ 0, 
 /* AvgAmpRPV             */ 0.000000, 
 /* MinVal                */ 2012070201942549261, 221945585, 'P', 
 /* MaxVal                */ 2014022922306867633, 277308727, 'P', 
 /* ModeVal               */ 2012070201942549261, 221945585, 'P', 
 /* HighModeFreq          */ 1, 
 /* NumOfDistinctVals     */ 1201056221, 
 /* NumOfRows             */ 1201056221, 
 /* CPUUsage              */ 0.000000, 
 /* IOUsage               */ 0.000000, 
 /* Reserved              */ 0, 
 /* Reserved              */ 0, 
 /* Reserved              */ 0.000000, 
 /* Reserved              */ 0.000000, 
 /* Reserved              */ '', 
 /* StatsSkipCount        */ 0, 
 /* SysInsertCnt          */ 0, 
 /* SysDeleteCnt          */ 0, 
 /* SysUpdateCnt          */ 0, 
 /* SysInsDelLastResetTS  */ TIMESTAMP '9999-12-31 23:59:59-00:00', 
 /* SysUpdLastResetTS     */ TIMESTAMP '9999-12-31 23:59:59-00:00', 
 /* IsSampleFollowingTrend*/ 0, 
 /** Interval: MaxVal[3], ModeVal[3], ModeFreq, LowFreq, OtherVals, OtherRows **/ 

here

/* SamplePercent */ 0.00,
is what I am going by . I am sure I am missing something.


Solution

  • Is this /* DBSVersion */ '14.00.00', correct? You seem to run a very old release, is this a TD Express?

    Well, if you got a 5,000,000,000 row table, this /* NumOfRows */ 1201056221, seems to indicates approx. 25%, that's strange, too.

    If you Explain your COLLECT STATS you can see if it's actually sampling or not.

    Can you add the table's DDL?