Tags: machine-learning, data-science, feature-scaling

Is it right to use different feature scaling techniques for different features?


I read this post about feature scaling: all-about-feature-scaling

The two main feature scaling techniques are:

  1. Min-max scaler - which works well for features whose distributions are not Gaussian.

  2. Standard scaler - which works well for features with Gaussian distributions.
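
For reference, this is roughly what the two transforms do (a minimal sketch of my own, using NumPy rather than scikit-learn, on the sample values from the answer below):

    import numpy as np
    
    x = np.array([14, 90, 80, 90, 70], dtype=float)
    
    # Min-max scaling: squeeze values into the [0, 1] range
    x_minmax = (x - x.min()) / (x.max() - x.min())
    
    # Standardization: zero mean, unit variance
    x_standard = (x - x.mean()) / x.std()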

I read other posts and examples, and it seems that we always use one scaling method (min-max or standard) for all the features.

I haven't seen an example or paper that suggests the following:

1. go over all the features, and for each feature:
   1.1 check the feature's distribution
   1.2 if the distribution is Gaussian:
       1.2.1 use the Standard scaler for this feature
   1.3 otherwise:
       1.3.1 use the min-max scaler for this feature
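
In code, the proposal would look roughly like the sketch below (my own illustration; the normality check via scipy.stats.shapiro and the 0.05 threshold are assumptions, since the procedure above does not say how to test for a Gaussian distribution):

    import pandas as pd
    from scipy.stats import shapiro
    from sklearn.preprocessing import MinMaxScaler, StandardScaler
    
    def scale_per_feature(df, alpha=0.05):
        # Hypothetical helper: pick a scaler per column based on a normality test
        scaled = df.copy()
        for col in df.columns:
            # Shapiro-Wilk test: p > alpha -> no evidence against normality
            _, p_value = shapiro(df[col])
            scaler = StandardScaler() if p_value > alpha else MinMaxScaler()
            scaled[col] = scaler.fit_transform(df[[col]])
        return scaled
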
  1. Why don't we mix scaling methods?

  2. What is wrong with my proposal, or what are its disadvantages?


Solution

  • If you do that, your features will end up on different scales, which is a problem because the features with the larger scale will dominate the rest (e.g., in KNN). The min-max-normalized features will be rescaled into the [0, 1] range, while the standardized ones will be transformed into a range spanning negative to positive values (e.g., [-2, +2], or even wider when the standard deviations are small).

    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.preprocessing import MinMaxScaler, StandardScaler
    
    dfTest = pd.DataFrame({'A': [14, 90, 80, 90, 70],
                           'B': [10, 107, 110, 114, 113]})
    
    # Min-max scale column A into the [0, 1] range
    scaler = MinMaxScaler()
    dfTest['A'] = scaler.fit_transform(dfTest[['A']])
    
    # Standardize column B to zero mean and unit variance
    scaler = StandardScaler()
    dfTest['B'] = scaler.fit_transform(dfTest[['B']])
    
    # Plot both columns with equal axis scaling to expose the mismatch
    ax = dfTest.plot.scatter('A', 'B')
    ax.set_aspect('equal')
    plt.show()

    (Scatter plot of min-max-scaled A against standardized B, showing the mismatched ranges.)
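
To make the mismatch concrete, a quick follow-up check (my addition, not part of the original answer) prints the resulting ranges of the two columns:

    # A stays within [0, 1]; B spans roughly [-2.0, 0.6],
    # so B would dominate distance-based methods such as KNN.
    print(dfTest.agg(['min', 'max']))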