java machine-learning classification weka resampling

What is the difference between supervised and unsupervised resampling in WEKA?


I would like to know the difference between weka.filters.supervised.instance.Resample and weka.filters.unsupervised.instance.Resample. In which cases should each one be used?


Solution

  • The documentation for both supervised and unsupervised resampling is the same except that the documentation for supervised resampling has the additional sentence:

    The filter can be made to maintain the class distribution in the subsample, or to bias the class distribution toward a uniform distribution.

    Supervised resampling also has an extra parameter:

    -B <num>
    Bias factor towards uniform class distribution.
    0 = distribution in input data  
    1 = uniform distribution.
    (default 0)
    

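    So, supervised resampling only applies when there is a class variable. With no bias (B=0), each subsample keeps the class distribution of the full data set. With full bias (B=1), the subsample is drawn so that every class is represented about equally, whatever its share of the input. Unsupervised resampling draws points uniformly from the whole population without regard to the class, so its subsamples match the input class distribution only on average; that makes it closest to B=0, not B=1. Use the supervised filter when the class proportions of the subsample matter (for example, to rebalance a skewed class), and the unsupervised filter when there is no class attribute or the class should be ignored.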
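
To make the effect of the bias factor concrete, here is a minimal sketch in plain Java. It does not call WEKA. It assumes the bias factor blends each class's share linearly between its share in the input (B=0) and an equal share for every class (B=1). The class name BiasedClassCounts and the method targetCounts are hypothetical, and WEKA's own rounding and handling of empty classes may differ.

```java
import java.util.Arrays;

public class BiasedClassCounts {

    /**
     * Target number of sampled instances per class for a given bias factor.
     * Assumption: the share of each class is a linear blend of its share in
     * the input data (bias = 0) and a uniform share 1/numClasses (bias = 1).
     * Because of rounding, the counts may not sum exactly to sampleSize.
     */
    public static int[] targetCounts(int[] classCounts, int sampleSize, double bias) {
        int total = Arrays.stream(classCounts).sum();
        int numClasses = classCounts.length;
        int[] target = new int[numClasses];
        for (int c = 0; c < numClasses; c++) {
            double inputShare = (double) classCounts[c] / total;
            double uniformShare = 1.0 / numClasses;
            double share = (1.0 - bias) * inputShare + bias * uniformShare;
            target[c] = (int) Math.round(sampleSize * share);
        }
        return target;
    }

    public static void main(String[] args) {
        int[] counts = {90, 10}; // a skewed two-class data set
        System.out.println(Arrays.toString(targetCounts(counts, 100, 0.0))); // [90, 10]
        System.out.println(Arrays.toString(targetCounts(counts, 100, 0.5))); // [70, 30]
        System.out.println(Arrays.toString(targetCounts(counts, 100, 1.0))); // [50, 50]
    }
}
```

With a 90/10 input, B=0 keeps the 90/10 split, B=1 yields an even 50/50 split, and values in between move gradually from one to the other.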