python machine-learning lasso-regression

Can I inverse transform the intercept and coefficients of LASSO regression after using Robust Scaler?


Is it possible to inverse transform the intercept and coefficients in LASSO regression, after fitting the model on scaled data using Robust Scaler?

I'm using LASSO regression to predict values on data that is not normalized and doesn't perform well with LASSO unless it's scaled beforehand. After scaling the data and fitting the LASSO model, I ideally want to be able to see what the model intercept and coefficients are but in the original units (not the scaled versions). I asked a similar question here and it doesn't appear this is possible. If not, why? Can someone explain this to me? I'm trying to broaden my understanding of how LASSO and Robust Scaler work.

Below is the code I was using. Here I was trying to inverse transform the coefficients using transformer_x and the intercept using transformer_y. However, it sounds like this is incorrect.

import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler
from sklearn.linear_model import Lasso

df = pd.DataFrame({'Y':[5, -10, 10, .5, 2.5, 15], 'X1':[1., -2.,  2., .1, .5, 3], 'X2':[1, 1, 2, 1, 1, 1], 
              'X3':[6, 6, 6, 5, 6, 4], 'X4':[6, 5, 4, 3, 2, 1]})

X = df[['X1','X2', 'X3' ,'X4']]
y = df[['Y']]

#Scaling 
transformer_x = RobustScaler().fit(X)
transformer_y = RobustScaler().fit(y) 
X_scal = transformer_x.transform(X)
y_scal = transformer_y.transform(y)

#LASSO
lasso = Lasso()
lasso = lasso.fit(X_scal, y_scal)

def pred_val(X1,X2,X3,X4): 

    print('X1 entered: ', X1)

    #Scale X value that user entered - by hand
    med_X = X.median()
    Q1_X = X.quantile(0.25)
    Q3_X = X.quantile(0.75)
    IQR_X = Q3_X - Q1_X
    X_scaled = (X1 - med_X)/IQR_X
    print('X1 scaled by hand: ', X_scaled[0].round(2))

    #Scale X value that user entered - by function
    X_scaled2 = transformer_x.transform(np.array([[X1, X2, X3, X4]]))
    print('X1 scaled by function: ', X_scaled2[0][0].round(2))

    #Intercept by hand
    med_y = y.median()
    Q1_y = y.quantile(0.25)
    Q3_y = y.quantile(0.75)
    IQR_y = Q3_y - Q1_y
    inv_int = med_y + IQR_y*lasso.intercept_[0]

    #Intercept by function
    inv_int2 = transformer_y.inverse_transform(lasso.intercept_.reshape(-1, 1))[0][0]

    #Coefficient by hand
    inv_coef = lasso.coef_[0]*IQR_y 

    #Coefficient by function 
    inv_coef2 = transformer_x.inverse_transform(lasso.coef_.reshape(1,-1))[0]

    #Prediction by hand
    preds = inv_int + inv_coef*X_scaled[0]

    #Prediction by function 
    preds_inner = lasso.predict(X_scaled2)  
    preds_f = transformer_y.inverse_transform(preds_inner.reshape(-1, 1))[0][0]

    print('\nIntercept by hand: ', inv_int[0].round(2))
    print('Intercept by function: ', inv_int2.round(2))
    print('\nCoefficients by hand: ', inv_coef[0].round(2))
    print('Coefficients by function: ', inv_coef2[0].round(2))
    print('\nYour predicted value by hand is: ', preds[0].round(2))
    print('Your predicted value by function is: ', preds_f.round(2))
    print('Perfect Prediction would be 80')

pred_val(10,1,1,1)

Update: I've updated my code to show the type of prediction function I'm trying to create. I'm just trying to create a function that does exactly what .predict does, but also shows the intercept and coefficients in their unscaled units.

Current output:

Out[1]:
X1 entered:  10
X1 scaled by hand:  5.97
X1 scaled by function:  5.97

Intercept by hand:  34.19
Intercept by function:  34.19

Coefficients by hand:  7.6
Coefficients by function:  8.5

Your predicted value by hand is:  79.54
Your predicted value by function is:  79.54
Perfect Prediction would be 80

Ideal output:

Out[1]:
X1 entered:  10
X1 scaled by hand:  5.97
X1 scaled by function:  5.97

Intercept by hand:  34.19
Intercept by function:  34.19

Coefficients by hand:  7.6
Coefficients by function:  7.6

Your predicted value by hand is:  79.54
Your predicted value by function is:  79.54
Perfect Prediction would be 80

Solution

  • Based on the linked SO thread, it looks like all you want is the prediction in the original (unscaled) units. Is that right?

    If yes, then all you need to do is:

    # Scale the test dataset
    X_test_scaled = transformer_x.transform(X_test)
    
    # Predict with the trained model
    prediction = lasso.predict(X_test_scaled)
    
    # Inverse transform the prediction (the scaler expects a 2D array)
    prediction_in_dollars = transformer_y.inverse_transform(prediction.reshape(-1, 1))
    

    UPDATE:

    Suppose the training data contains just a single feature named X. With its default settings, this is what RobustScaler does:

    X_scaled = (X - median(X))/IQR(X)
    y_scaled = (y - median(y))/IQR(y)
    

    Then, the lasso regression will give a prediction like this:

    a * X_scaled + b = y_scaled
    

    You have to work through the equations to see what the model coefficients look like on the unscaled data:

    # Substituting X_scaled and y_scaled from the scaling equations.
    # Here median(X), IQR(X), median(y) and IQR(y) are plain numbers you already know from the training phase.
    a * (X - median(X))/IQR(X) + b = (y - median(y))/IQR(y)
    

    If you rearrange this into an equation of the form a_new * X + b_new = y, you end up with:

    a_new = (a * (X - median(X)) / (X * IQR(X))) * IQR(y)
    b_new = b * IQR(y) + median(y)
    a_new * X + b_new = y
    

    With this rearrangement, the unscaled coefficient (a_new) depends on X. So you can plug the unscaled X into the equation to make predictions directly, but the scaling is still being applied implicitly inside a_new.

    UPDATE 2

    I've adapted your code and it now shows how you can get the coefficients in the original scale. The script is just the implementation of the formulas I'm showing above.

    import pandas as pd
    import numpy as np
    from sklearn.preprocessing import RobustScaler
    from sklearn.linear_model import Lasso
    
    df = pd.DataFrame({'Y':[5, -10, 10, .5, 2.5, 15], 'X1':[1., -2.,  2., .1, .5, 3], 'X2':[1, 1, 2, 1, 1, 1],
                  'X3':[6, 6, 6, 5, 6, 4], 'X4':[6, 5, 4, 3, 2, 1]})
    
    X = df[['X1','X2','X3','X4']]
    y = df[['Y']]
    
    #Scaling
    transformer_x = RobustScaler().fit(X)
    transformer_y = RobustScaler().fit(y)
    X_scal = transformer_x.transform(X)
    y_scal = transformer_y.transform(y)
    
    #LASSO
    lasso = Lasso()
    lasso = lasso.fit(X_scal, y_scal)
    
    def pred_val(X_test):
    
        print('X entered: ',)
        print (X_test.values[0])
    
        #Scale X value that user entered - by hand
        med_X = X.median()
        Q1_X = X.quantile(0.25)
        Q3_X = X.quantile(0.75)
        IQR_X = Q3_X - Q1_X
        X_scaled = ((X_test - med_X)/IQR_X).fillna(0).values
        print('X_test scaled by hand: ',)
        print (X_scaled[0])
    
        #Scale X value that user entered - by function
        X_scaled2 = transformer_x.transform(X_test)
        print('X_test scaled by function: ',)
        print (X_scaled2[0])
    
        #Intercept by hand
        med_y = y.median()
        Q1_y = y.quantile(0.25)
        Q3_y = y.quantile(0.75)
        IQR_y = Q3_y - Q1_y
    
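        # Pull scalars out of the one-element Series
        iqr_y = IQR_y.iloc[0]
        med_y_val = med_y.iloc[0]
    
        # Rearranged coefficients: a * (X - median(X)) / (X * IQR(X)) * IQR(y)
        a = lasso.coef_
        coef_new = ((a * (X_test - med_X).values) / (X_test * IQR_X).values) * iqr_y
        # The zero-IQR column (X2) produces 0/0 = NaN, which is replaced by 0
        coef_new = np.nan_to_num(coef_new)[0]
    
        # Rearranged intercept: b * IQR(y) + median(y)
        b = lasso.intercept_[0]
        intercept_new = b * iqr_y + med_y_val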
    
        custom_pred = sum((coef_new * X_test.values)[0]) + intercept_new
    
        pred = lasso.predict(X_scaled2)
        final_pred = transformer_y.inverse_transform(pred.reshape(-1, 1))[0][0]
    
    
        print('Original intercept: ', lasso.intercept_[0].round(2))
        print('New intercept: ', intercept_new.round(2))
        print('Original coefficients: ', lasso.coef_.round(2))
        print('New coefficients: ', coef_new.round(2))
        print('Your predicted value by function is: ', final_pred.round(2))
        print('Your predicted value by hand is: ', custom_pred.round(2))
    
    
    X_test = pd.DataFrame([10,1,1,1]).T
    X_test.columns = ['X1', 'X2', 'X3', 'X4']
    
    pred_val(X_test)
    

    You can see that the custom prediction uses the original values (X_test.values).

    Result:

    X entered: 
    [10  1  1  1]
    
    X_test scaled by hand: 
    [ 5.96774194  0.         -6.66666667 -1.        ]
    X_test scaled by function: 
    [ 5.96774194  0.         -6.66666667 -1.        ]
    
    Original intercept:  0.01
    New intercept:  3.83
    
    Original coefficients:  [ 0.02  0.   -0.   -0.  ]
    New coefficients:  [0.1 0.  0.  0. ]
    
    Your predicted value by function is:  4.83
    Your predicted value by hand is:  4.83
    

    As explained above, the new coefficients computed this way depend on X_test. This means you cannot reuse their current values for another test sample: they will be different for different inputs.
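
The dependence on the input comes from how the equation was rearranged, not from the model itself. As a sketch of an alternative rearrangement (not part of the answer above), expanding a * (X - median(X)) / IQR(X) and moving the constant term into the intercept gives a_new = a * IQR(y) / IQR(X) and b_new = median(y) + IQR(y) * b - a_new * median(X), neither of which depends on X. The code below assumes the same data and the default RobustScaler and Lasso settings used above; the names coef_orig, intercept_orig and X_new are illustrative. It reads the medians and IQRs from the fitted scalers' center_ and scale_ attributes, which also covers X2: its IQR is zero, and scikit-learn stores a scale of 1 in that case.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import RobustScaler

df = pd.DataFrame({'Y': [5, -10, 10, .5, 2.5, 15], 'X1': [1., -2., 2., .1, .5, 3],
                   'X2': [1, 1, 2, 1, 1, 1], 'X3': [6, 6, 6, 5, 6, 4],
                   'X4': [6, 5, 4, 3, 2, 1]})
X = df[['X1', 'X2', 'X3', 'X4']]
y = df[['Y']]

# Fit the scalers and the model on scaled data, as in the scripts above
transformer_x = RobustScaler().fit(X)
transformer_y = RobustScaler().fit(y)
lasso = Lasso().fit(transformer_x.transform(X), transformer_y.transform(y).ravel())

# Medians and IQRs stored by the fitted scalers
x_center, x_scale = transformer_x.center_, transformer_x.scale_
y_center, y_scale = transformer_y.center_[0], transformer_y.scale_[0]

# Input-independent coefficients and intercept in the original units
coef_orig = lasso.coef_ * y_scale / x_scale
intercept_orig = y_center + y_scale * lasso.intercept_ - np.sum(coef_orig * x_center)

# Compare against scale -> predict -> inverse transform on two new samples
X_new = pd.DataFrame([[10, 1, 1, 1], [2, 1, 6, 4]], columns=X.columns)
pred_direct = X_new.to_numpy() @ coef_orig + intercept_orig
pred_pipeline = transformer_y.inverse_transform(
    lasso.predict(transformer_x.transform(X_new)).reshape(-1, 1)
).ravel()

print('Coefficients (original units):', coef_orig.round(3))
print('Intercept (original units):', round(float(intercept_orig), 3))
print('Direct predictions:  ', pred_direct.round(2))
print('Pipeline predictions:', pred_pipeline.round(2))
```

If the two prediction arrays agree for every row, the same coef_orig and intercept_orig can be reused for any input. These are still the coefficients of the model fitted on scaled data, expressed in the original units; they are not what Lasso would return if it were fitted on the unscaled data directly, because the L1 penalty acts on the scaled coefficients.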