python, scikit-learn, transform, data-fitting, scalar

Scaling new values after fit_transform


I have the following features:

 array([[290.,  50.],
       [290.,  46.],
       [285.,  44.],
       ...,
       [295.,  46.],
       [299.,  46.],
       [  0.,   0.]])

After transforming them with:

from sklearn.preprocessing import StandardScaler
self.scaler = StandardScaler()  
self.scaled_features = self.scaler.fit_transform(self.features)

I have scaled_features:

array([[ 0.27489919,  0.71822864],
       [ 0.27489919,  0.26499222],
       [ 0.18021955,  0.03837402],
       ...,
       [ 0.36957884,  0.26499222],
       [ 0.44532255,  0.26499222],
       [-5.2165202 , -4.94722653]])

Now I want to scale a new sample with self.scaler, so I pass in my new feature example t:

t = [299.0, 46.0] 
new_data = np.array(t).reshape(-1, 1)
new_data_scaled = self.scaler.transform(t)

I get:

non-broadcastable output operand with shape (2,1) doesn't match the broadcast shape (2,2)

What am I doing wrong? Why is new_data not scaled?


Solution

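  • There are two problems. First, you are passing the list t to transform, not new_data. Second, new_data has shape (2, 1) but should have shape (1, 2): the scaler was fitted on two features, so it expects one row per sample and two columns, whereas reshape(-1, 1) produces two samples with one feature each. So if you change it to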

    t = [299.0, 46.0] 
    new_data = np.array(t).reshape(1, -1)
    new_data_scaled = self.scaler.transform(new_data)
    

    you should get scaled data.
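
    The fix above can be sketched as a standalone script. This is a minimal sketch, not the asker's actual class: the training array is an assumption built from the few rows visible in the question (the full data is elided there), and the plain variable names stand in for the self.* attributes. The last lines check the result against the scaler's fitted mean_ and scale_ attributes.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Assumed training data: only the rows visible in the question,
# since the full feature array is elided there.
features = np.array([
    [290.0, 50.0],
    [290.0, 46.0],
    [285.0, 44.0],
    [295.0, 46.0],
    [299.0, 46.0],
    [0.0, 0.0],
])

scaler = StandardScaler()
scaled_features = scaler.fit_transform(features)

t = [299.0, 46.0]

# reshape(-1, 1) would give shape (2, 1): two samples, one feature each.
wrong_shape = np.array(t).reshape(-1, 1).shape

# reshape(1, -1) gives shape (1, 2): one sample with two features,
# which matches the two columns the scaler was fitted on.
new_data = np.array(t).reshape(1, -1)
new_data_scaled = scaler.transform(new_data)

# transform applies (x - mean) / std per column, using the statistics
# learned during fit_transform.
manual = (new_data - scaler.mean_) / scaler.scale_

print(wrong_shape, new_data.shape)
print(new_data_scaled)
print(np.allclose(new_data_scaled, manual))
```

    Because t equals one of the training rows in this sketch, its scaled value matches the corresponding row of scaled_features; with the asker's full data set the numbers will differ.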