Tags: python, neural-network, training-data, standard-deviation

Scale training set by inverse standard deviation using python


I am using Python and have a training set that I need to 'subtract the mean and scale by inverse standard deviation'. I assume subtracting the mean just means subtracting each column's mean from every value in that column, but I have no idea what I am meant to do when it says to 'scale by inverse standard deviation'.

I have googled it, but nothing has come up in relation to Python or neural networks, so I'm not sure how to continue.

Thanks

EDIT: Would this be correct?

scaled_train =  (train - train_mean) / train_std_deviation

Solution

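  • In the future, questions like this are a better fit for CrossValidated.

    Scaling by the inverse standard deviation means multiplying by 1/std, i.e. dividing by the standard deviation, so the expression in your edit is correct as long as the mean and standard deviation are computed per column.

    Let your dataset be x, with one sample per row and one feature per column. Then:

    import numpy as np
    x = np.array(x, dtype=float)  # float copy, so in-place ops work even for integer input
    x -= x.mean(axis=0)           # subtract each column's mean
    x /= x.std(axis=0)            # divide each column by its standard deviation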
    

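    This is called standardization (also known as z-score normalization): afterwards every column has mean 0 and standard deviation 1.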
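
When there is also a test or validation set, a common approach is to compute the mean and standard deviation on the training set only and reuse them for the other data. The following is a minimal sketch of that idea, assuming 2-D arrays with one sample per row; the function name `standardize` and the sample values in `test` are made up for illustration.

```python
import numpy as np

def standardize(train, test):
    """Standardize both arrays column-wise using statistics from `train` only."""
    train = np.asarray(train, dtype=float)
    test = np.asarray(test, dtype=float)
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # Assumption: constant columns (std == 0) are left unscaled to avoid dividing by zero.
    std = np.where(std == 0, 1.0, std)
    return (train - mean) / std, (test - mean) / std

train = np.array([[1., -1., 2.],
                  [2., 0., 0.],
                  [0., 1., -1.]])
test = np.array([[-1., 1., 0.]])  # hypothetical held-out sample

scaled_train, scaled_test = standardize(train, test)
print(scaled_train.mean(axis=0))  # approximately zero in every column
print(scaled_train.std(axis=0))   # approximately one in every column
```

Note that the scaled test data will generally not have exactly zero mean and unit standard deviation, because it is scaled with the training statistics.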

    This can also be done with scikit-learn's preprocessing.scale, which standardizes each column (feature) by default:

    >>> from sklearn import preprocessing
    >>> import numpy as np
    >>> X_train = np.array([[ 1., -1.,  2.],
    ...                     [ 2.,  0.,  0.],
    ...                     [ 0.,  1., -1.]])
    >>> X_scaled = preprocessing.scale(X_train)
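
If the same scaling has to be applied later to new data, scikit-learn's `StandardScaler` may be more convenient than `preprocessing.scale`, since it stores the training mean and standard deviation. This is a short sketch rather than a recommendation from the original answer; `X_test` is a made-up sample.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])
X_test = np.array([[-1., 1., 0.]])  # hypothetical held-out sample

# Learn the per-column mean and standard deviation from the training data only.
scaler = StandardScaler().fit(X_train)

X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuses the training statistics

print(scaler.mean_)   # per-column means learned from X_train
print(scaler.scale_)  # per-column standard deviations learned from X_train
```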
    
