python, python-3.x, scikit-learn, resampling, imblearn

Difference between oversampling and upsampling, and between SMOTE and over_sampling.SMOTE?


This question may be a bit paranoid, but Google's search results for these terms get mixed up with audio processing, Fourier transforms, and similar topics.

  1. Specifically for Python and numeric data, is there a difference between oversampling and upsampling the minority class?

  2. I am using imblearn to oversample/upsample a minority class. I am currently using

    from imblearn.over_sampling import SMOTE
    
    sm = SMOTE(random_state=12, ratio = 1.0)
    x_train_res, y_train_res = sm.fit_sample(X_train, y_train)
    

    but more recently, I came across

    sm = over_sampling.SMOTE(random_state=12, ratio = 1.0)
    x_train_res, y_train_res = sm.fit_sample(X_train, y_train)
    

    What is the difference?


Solution

  • from imblearn.over_sampling import SMOTE
    
    sm = SMOTE(random_state=12, ratio = 1.0)
    

    and

    from imblearn import over_sampling
    
    sm = over_sampling.SMOTE(random_state=12, ratio = 1.0)
    

    are identical. Both create the same SMOTE object; the only difference is the name you use to reach the SMOTE class in your code.
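
To see why the import style does not change which object you get, here is a minimal sketch using the standard library as a stand-in, so it runs without imbalanced-learn installed. In it, `os` plays the role of `imblearn`, `os.path` the role of `imblearn.over_sampling`, and `join` the role of `SMOTE`. The helper `names_bound_by` is a hypothetical name made up for this illustration; it is not part of any library.

```python
import os.path             # binds only the top-level name "os"
from os import path        # binds the submodule under the name "path"
from os.path import join   # binds the function itself under the name "join"

# All three spellings reach the very same object.
same_object = os.path.join is path.join is join


def names_bound_by(statement):
    """Run an import statement in a fresh namespace and list the names it binds."""
    namespace = {}
    exec(statement, namespace)
    # exec adds "__builtins__" to the namespace; skip dunder entries.
    return sorted(name for name in namespace if not name.startswith("__"))


plain_import = names_bound_by("import os.path")             # ['os']
submodule_import = names_bound_by("from os import path")    # ['path']
direct_import = names_bound_by("from os.path import join")  # ['join']

print(same_object)       # True
print(plain_import)      # ['os']
print(submodule_import)  # ['path']
print(direct_import)     # ['join']
```

The practical consequence is that `import package.submodule` only gives you the name `package`, so you would have to write `package.submodule.Thing`. To write `submodule.Thing`, the submodule has to be imported with `from package import submodule`.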