python pandas sklearn-pandas

How to use StandardScaler and its transform() method to scale the train and test splits (completely lost)


#Code task 10

#Call the StandardScaler's fit method on X_tr to fit the scaler,
#then use its transform() method to apply the scaling to both the train and test split
#data (X_tr and X_te), naming the results X_tr_scaled and X_te_scaled, respectively

scaler = StandardScaler()
scaler.fit_transform(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)

This was the code that I used but I get a

RuntimeWarning: invalid value encountered in true_divide

and

RuntimeWarning: Degrees of freedom <= 0 for slice. result=op(x, *args, **kwargs)

I looked through online resources, which is how I arrived at this code. The problem tells me to use transform(), but that did not work at all, whereas fit_transform() at least gave me an output.

I don't understand what is going on or why I get the RuntimeWarning. If anyone can provide an explanation, article, or PDF that walks me through sklearn or explains these warnings, I would greatly appreciate it.


Solution

  • You don't need to call fit_transform() and then transform() again: fit_transform() already fits the scaler and returns the scaled training data.

    Fit the scaler on the training data only, then transform both the training and testing datasets:

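    from sklearn.preprocessing import StandardScaler
    # Fit on the training split only, so no test-set statistics leak in
    scaler = StandardScaler().fit(X_tr)
    # Scale both splits with the training mean and standard deviation
    X_tr_scaled = scaler.transform(X_tr)
    X_te_scaled = scaler.transform(X_te)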
    

    Let me know if it worked!
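
For anyone who wants to run the pattern end to end, here is a minimal sketch on synthetic data. The random array and the train_test_split call are stand-ins for the asker's real X_tr and X_te, which the question does not show. The last check is a guess at the source of the warnings: a training column with no usable values (all NaN) can trigger "Degrees of freedom <= 0 for slice" in some versions, but nothing in the thread confirms that this is what happened here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): 100 rows, 3 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 3))

# Hypothetical split; the original X_tr and X_te come from the asker's notebook.
X_tr, X_te = train_test_split(X, test_size=0.25, random_state=0)

# Learn the per-column mean and standard deviation from the training split only.
scaler = StandardScaler().fit(X_tr)

# Apply those same training statistics to both splits.
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)

# The training columns end up with mean ~0 and standard deviation ~1.
# The test columns will be close to that, but generally not exact.
print(X_tr_scaled.mean(axis=0).round(6))
print(X_tr_scaled.std(axis=0).round(6))

# Assumed diagnostic: list any training columns that are entirely NaN.
all_nan_cols = np.isnan(X_tr).all(axis=0)
print("all-NaN columns:", np.flatnonzero(all_nan_cols))
```

If the last line lists any column indices for your own X_tr, it is worth looking at how those columns were built or imputed before scaling.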