
Is there an R package that can do multiclass oversampling, undersampling, both, and SMOTE?


I am looking for packages that can do multiclass oversampling, undersampling, or both. I tried the ROSE package, but it only works for binary classification.

My target variable has 4 classes, and their percentages are: "0": 70%, "1": 15%, "2": 10%, "3": 5%, "4": 5%.


Solution

  • You can try SMOTE. SMOTE oversamples the minority classes by generating synthetic observations, and some implementations can also undersample the majority class. In many cases SMOTE outperforms other sampling techniques. Here is a snippet of code in Python. In R, it is a little harder to equalize the class distribution of the target variable using SMOTE, but it can be done by considering 2 classes at a time.

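    import pandas
    from imblearn.over_sampling import SMOTE

    # "auto" oversamples every class except the majority up to the majority count;
    # a float such as 1.0 is only accepted for binary targets.
    sm = SMOTE(random_state=99, sampling_strategy="auto")
    x_train, y_train = sm.fit_resample(X_var, target_class)
    print(pandas.Series(y_train).value_counts())  # verify class distribution here


    sampling_strategy (called ratio in older imbalanced-learn releases, where fit_resample was fit_sample) is the hyperparameter here.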

    Hope this helps.
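
For the "both" case in the question, here is a minimal sketch of plain random resampling (not SMOTE, since it creates no synthetic points) that brings every class to the same size: larger classes are undersampled and smaller ones are oversampled by duplication. It uses only the Python standard library. The function name `resample_to_target`, the toy data, and the target of 25 rows per class are all hypothetical choices made for illustration.

```python
import random
from collections import Counter, defaultdict


def resample_to_target(rows, labels, target_per_class, seed=99):
    """Randomly resample every class to exactly target_per_class rows."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Undersample: draw without replacement.
            picked = rng.sample(members, target_per_class)
        else:
            # Oversample: keep all originals, then add duplicates
            # drawn with replacement until the target is reached.
            extra = rng.choices(members, k=target_per_class - len(members))
            picked = members + extra
        out_rows.extend(picked)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Hypothetical toy data with an imbalanced four-class target.
labels = [0] * 70 + [1] * 15 + [2] * 10 + [3] * 5
rows = [[i, i * 0.5] for i in range(len(labels))]

new_rows, new_labels = resample_to_target(rows, labels, target_per_class=25)
print(Counter(new_labels))  # every class should now have 25 rows
```

Duplicating rows is cruder than SMOTE's interpolation between neighbours, but the same per-class loop structure applies if you swap the oversampling branch for a synthetic-point generator.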