machine-learning normalization sklearn-pandas

Standardize only a few selected columns in machine learning


I have a CSV file in which only a few columns need normalization (the others are binary values). Should I selectively normalize only the required columns, or should I normalize all the columns in the table? If I normalize the entire table, will I lose information or introduce noise into the data that doesn't require any normalization or standardization?


Solution

  • Let's make a few points clear.

    • Binary data is categorical data (e.g. IsEmployed: 0/1)
    • Only numerical data needs to be normalized, so scale the numeric columns selectively and leave the binary columns as they are

    Understanding part:

    • Normalizing data (min-max scaling) means rescaling the values to the range 0 to 1. Standardizing, by contrast, rescales the values to zero mean and unit variance.

    +Added:

    • For categorical data, we apply one-hot encoding (e.g. scikit-learn's OneHotEncoder), which converts each category into its own binary (0/1) column.
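
The points above can be sketched with scikit-learn's ColumnTransformer, which applies a scaler to the chosen columns only and passes the rest through unchanged. This is a minimal sketch, not the only way to do it: the DataFrame, its column names (age, income, is_employed) and its values are made up for illustration, and in practice the data would come from your CSV file.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical data standing in for the CSV (names and values are assumptions)
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "income": [30000.0, 52000.0, 81000.0, 64000.0],
    "is_employed": [0, 1, 1, 0],  # binary column, left untouched
})

# Only these columns are scaled
numeric_cols = ["age", "income"]

preprocessor = ColumnTransformer(
    transformers=[("scale", StandardScaler(), numeric_cols)],
    remainder="passthrough",  # keep all other columns as they are
)

scaled = preprocessor.fit_transform(df)

# The output puts the transformed columns first, then the passthrough columns
passthrough_cols = [c for c in df.columns if c not in numeric_cols]
scaled_df = pd.DataFrame(
    scaled, columns=numeric_cols + passthrough_cols, index=df.index
)
print(scaled_df)
```

If a 0 to 1 range is wanted instead of zero mean and unit variance, MinMaxScaler from sklearn.preprocessing can be swapped in for StandardScaler. When a separate test set is involved, fit the transformer on the training data only and reuse it with transform on the test data.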