Tags: python, machine-learning, scikit-learn, lasso-regression, bisection

How to change alpha value of sklearn Lasso object for bisection


I'm trying to implement an algorithm that efficiently searches for the value of alpha (the L1 regularization parameter) of a lasso problem that results in a given number of nonzero features.

To do this, I was planning on initializing an sklearn Lasso object, computing the coefficient vector for the problem, then changing the object's alpha value to compute the coefficient vector for the subsequent problem. Doing this would allow me to take advantage of "warm start," which uses the coefficient vector for the previous alpha as the initial vector for the algorithm run on the next alpha, resulting in faster convergence.

The problem is that sklearn doesn't seem to provide a way to change the alpha value of an existing Lasso object (which is kind of unfathomable to me, since that seems to be the whole point of "warm start").

How would I accomplish my goal of implementing bisection on Lasso with warm start? Should I just use lasso_path and loop, feeding in the coefficient vector manually? Why would sklearn not include such an obvious feature? Am I being dumb?
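
For reference, the lasso_path route I have in mind would look roughly like this (just a sketch; the toy data and the size of the alpha grid are arbitrary):

    import numpy as np
    from sklearn.linear_model import lasso_path

    # Toy data standing in for the real problem
    x = np.random.random((100, 4))
    y = np.random.random(100)

    # lasso_path fits a whole grid of alphas, warm-starting each fit
    # from the previous solution internally
    alphas, coefs, _ = lasso_path(x, y, n_alphas=50)

    # Number of nonzero coefficients at each alpha on the grid
    nonzero_counts = np.count_nonzero(coefs, axis=0)
    print(list(zip(alphas, nonzero_counts)))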


Solution

  • All Scikit-learn objects that can be part of a pipeline have a get_params and a set_params method. get_params returns a dictionary of the object's current parameters, and set_params updates parameters with new values. See the following example code:

    import numpy as np
    from sklearn.linear_model import Lasso
    
    # Make some random data
    x = np.random.random((100, 4))
    y = np.random.random(100)
    
    m = Lasso(warm_start=True)
    m.fit(x, y)
    
    # Print out the current params
    print(m.get_params())
    # The output will be
    #{'alpha': 1.0, 'copy_X': True, 'fit_intercept': True, 
    # 'max_iter': 1000, 'normalize': False, 'positive': False, 
    # 'precompute': False, 'random_state': None, 'selection': 'cyclic',
    # 'tol': 0.0001, 'warm_start': True}
    
    # We can update the alpha value
    m.set_params(alpha=2.0)
    
    # Fit again if we want
    m.fit(x, y)
    
    # Print out the current params
    print(m.get_params())
    # The output will be
    #{'alpha': 2.0, 'copy_X': True, 'fit_intercept': True, 
    # 'max_iter': 1000, 'normalize': False, 'positive': False, 
    # 'precompute': False, 'random_state': None, 'selection': 'cyclic',
    # 'tol': 0.0001, 'warm_start': True}
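
  • Combining set_params with warm_start=True gives exactly the bisection you describe: each call to fit reuses the previous coef_ as its starting point. Below is a minimal sketch; the target count, the bracketing interval, and the iteration cap are arbitrary choices for illustration, not anything prescribed by sklearn.

    import numpy as np
    from sklearn.linear_model import Lasso

    x = np.random.random((100, 4))
    y = np.random.random(100)

    target_nonzeros = 2              # desired number of nonzero coefficients (illustrative)
    alpha_lo, alpha_hi = 1e-4, 1.0   # assumed to bracket the target count

    m = Lasso(warm_start=True)

    for _ in range(30):                   # fixed number of bisection steps
        alpha = 0.5 * (alpha_lo + alpha_hi)
        m.set_params(alpha=alpha)         # change alpha without recreating the object
        m.fit(x, y)                       # warm_start reuses the previous coef_ as the initial guess
        nonzeros = np.count_nonzero(m.coef_)
        if nonzeros > target_nonzeros:
            alpha_lo = alpha              # too many features kept: increase regularization
        elif nonzeros < target_nonzeros:
            alpha_hi = alpha              # too few features kept: decrease regularization
        else:
            break                         # hit the target count

    print(m.get_params()['alpha'], np.count_nonzero(m.coef_))

  • Because the number of nonzero coefficients changes in discrete jumps, an exact match for the target count is not guaranteed; the loop simply narrows the bracket and stops early if it happens to hit the target.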