I read this post about feature scaling: all-about-feature-scaling
The two main feature scaling techniques are:
- Min-max scaler, which works well for features whose distributions are not Gaussian.
- Standard scaler, which works well for features with Gaussian distributions.
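For reference, min-max scaling maps each value to $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$, so the result lies in $[0, 1]$, while the standard scaler computes $z = (x - \mu) / \sigma$, giving zero mean and unit variance.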
I read other posts and examples, and it seems that we always use one scaling method (min-max or standard) for all the features.
I haven't seen an example or paper that suggests something like this (sketched in code right after the list):
1. Go over all the features, and for each feature:
   1.1 Check the feature's distribution
   1.2 If the distribution is Gaussian:
       1.2.1 Use the standard scaler for this feature
   1.3 Otherwise:
       1.3.1 Use the min-max scaler for this feature
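To make the proposal concrete, here is a minimal sketch of what I mean; the helper name scale_per_feature is mine, and the Shapiro-Wilk test from scipy.stats is just one possible way to check for normality:

import pandas as pd
from scipy import stats
from sklearn.preprocessing import MinMaxScaler, StandardScaler

def scale_per_feature(df, alpha=0.05):
    # Hypothetical helper: pick a scaler per column based on a normality test.
    # Shapiro-Wilk is only one possible check; alpha is the usual 5% threshold.
    scaled = pd.DataFrame(index=df.index)
    for col in df.columns:
        _, p_value = stats.shapiro(df[col])
        # If normality is not rejected, standardize; otherwise min-max scale.
        scaler = StandardScaler() if p_value > alpha else MinMaxScaler()
        scaled[col] = scaler.fit_transform(df[[col]]).ravel()
    return scaled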
Why don't we mix the scaling methods?
What is wrong with my proposal, or what are its disadvantages?
Then your features will end up on different scales, which is a problem because the features with the larger scale will dominate the rest (e.g., in KNN). The features scaled with min-max normalization will lie in the [0, 1] range, while the standardized ones will be transformed into a negative-to-positive range (e.g., [-2, +2], or even wider when the standard deviation is small).
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

dfTest = pd.DataFrame({'A': [14, 90, 80, 90, 70],
                       'B': [10, 107, 110, 114, 113]})

# Scale column A into [0, 1] with min-max scaling
scaler = MinMaxScaler()
dfTest['A'] = scaler.fit_transform(dfTest[['A']])

# Standardize column B to zero mean and unit variance
scaler = StandardScaler()
dfTest['B'] = scaler.fit_transform(dfTest[['B']])

# With an equal aspect ratio, the difference in spread between A and B is visible
ax = dfTest.plot.scatter('A', 'B')
ax.set_aspect('equal')
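You can also see the effect numerically rather than in the plot; this small follow-up check (not part of the original snippet) compares the feature ranges and each feature's contribution to a Euclidean distance:

# Compare per-feature ranges after the mixed scaling: A lies in [0, 1],
# while B spans roughly [-2, +0.6] here.
print(dfTest.agg(['min', 'max']))

# Contribution of each feature to the squared Euclidean distance between
# the first two rows: B's term is several times larger than A's, so a
# distance-based model like KNN would mostly be driven by B.
diff = dfTest.iloc[0] - dfTest.iloc[1]
print(diff ** 2)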