
Difference between Normalizer and MinMaxScaler


I'm trying to understand the effects of applying the Normalizer, the MinMaxScaler, or both to my data. I've read the scikit-learn docs and seen some usage examples. I understand that MinMaxScaler is important (it is important to scale the features), but what about the Normalizer?

The practical result of using the Normalizer on my data is still unclear to me.

MinMaxScaler is applied column-wise, Normalizer is applied row-wise. What does that imply? Should I use the Normalizer, just the MinMaxScaler, or both?


Solution

  • As you have said,

    MinMaxScaler is applied column-wise, Normalizer is applied row-wise.

    Do not confuse Normalizer with MinMaxScaler. The Normalizer class from scikit-learn rescales each sample individually to unit norm. It is a row-based, not a column-based, technique: each row is divided by its own norm, so the result for one value depends only on the other values in the same row, not on the rest of the column.

    So, remember that we scale features, not records, because we want features to be on the same scale, so that the trained model does not give features different weights merely because of their ranges. Scaling the records instead would give each record its own scale, which is not what we need.

    So, if features are represented by rows, then you should use the Normalizer. But in most cases, features are represented by columns, so you should use one of the scalers from scikit-learn depending on the case:

    • MinMaxScaler transforms features by scaling each feature to a given range. It scales and translates each feature individually such that it is in the given range on the training set, e.g. between zero and one. The transformation is given by:

      X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
      
      X_scaled = X_std * (max - min) + min
      

      where min, max = feature_range.

      This transformation is often used as an alternative to zero mean, unit variance scaling.

    • StandardScaler standardizes features by removing the mean and scaling to unit variance. The standard score of a sample x is calculated as: z = (x - u) / s, where u is the mean and s is the standard deviation of the feature. Use this when the features are approximately normally distributed, or at least when mean and variance are meaningful summaries of them.

    • RobustScaler is robust to outliers. It removes the median and scales the data according to the IQR (Interquartile Range), which is the range between the 25th percentile (1st quartile) and the 75th percentile (3rd quartile). Because the median and IQR are barely affected by a few extreme values, outliers do not distort the scaling the way they would with MinMaxScaler or StandardScaler.
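To make the row-wise vs. column-wise distinction concrete, here is a minimal sketch on a small made-up array (the data is hypothetical, just for illustration). Note how Normalizer produces rows of unit norm, while the three scalers operate on each column independently:

```python
import numpy as np
from sklearn.preprocessing import (MinMaxScaler, Normalizer,
                                   RobustScaler, StandardScaler)

# Two features on very different scales, three samples.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# Normalizer: each ROW is divided by its own L2 norm.
X_norm = Normalizer(norm="l2").fit_transform(X)
print(np.linalg.norm(X_norm, axis=1))   # every row norm is 1.0

# MinMaxScaler: each COLUMN is mapped to [0, 1] independently.
X_minmax = MinMaxScaler().fit_transform(X)
print(X_minmax.min(axis=0))             # column minima are 0.0
print(X_minmax.max(axis=0))             # column maxima are 1.0

# StandardScaler: each column gets zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)
print(X_std.mean(axis=0))               # column means are ~0.0

# RobustScaler: each column has its median removed, scaled by its IQR.
X_rob = RobustScaler().fit_transform(X)
print(np.median(X_rob, axis=0))         # column medians are 0.0
```

Notice that after Normalizer the two rows `[1, 10]` and `[2, 20]` become identical (both point in the same direction), which is exactly why it is useful when only the direction of a sample matters and harmful when absolute feature magnitudes carry information.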