I have a fairly large dataset with more than 100 features (candidate coefficients) and thousands of entries, so I would like to use the Lasso approach for model training.
I am currently looking into the scikit-learn documentation for:
Although the implementation seems straightforward, I was unable to find an input argument that restricts the maximum number of non-zero coefficients, e.g. to 10.
To be clearer: in the MATLAB implementation of Lasso, the parameter 'DFMax' does exactly this.
Is there such an option in any Python implementation?
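Directly restricting the number of nonzero coefficients is an NP-hard problem (it amounts to best-subset selection). One of the beauties of the LASSO is that its L1 penalty is a convex relaxation of that constraint, and under suitable conditions it asymptotically recovers the solution of the NP-hard problem.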
I don't know how DFMax is implemented in MATLAB, but my suggestion is to do the following:
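One way this could look in scikit-learn, as a minimal sketch rather than a built-in option: compute the Lasso path over a grid of `alpha` values with `lasso_path`, then keep the weakest penalty whose solution has at most the desired number of nonzero coefficients. The helper name `lasso_with_max_nonzero`, the grid size, and the synthetic data are assumptions made for illustration, and this is not necessarily what MATLAB does internally.

```python
import numpy as np
from sklearn.linear_model import lasso_path


def lasso_with_max_nonzero(X, y, max_nonzero=10, n_alphas=100, eps=1e-3):
    """Hypothetical helper: pick the smallest alpha on a grid whose Lasso
    solution has at most `max_nonzero` nonzero coefficients."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # lasso_path does not fit an intercept, so center the data manually.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean

    # Smallest alpha for which all coefficients are zero, under
    # scikit-learn's objective (1 / (2 n)) ||y - Xw||^2 + alpha ||w||_1.
    alpha_max = np.abs(Xc.T @ yc).max() / len(yc)
    grid = np.geomspace(alpha_max, alpha_max * eps, n_alphas)

    # coefs has shape (n_features, n_alphas); alphas come back in decreasing order.
    alphas, coefs, _ = lasso_path(Xc, yc, alphas=grid)

    # Count nonzero coefficients at each alpha and keep the admissible ones.
    nnz = np.count_nonzero(coefs, axis=0)
    ok = np.flatnonzero(nnz <= max_nonzero)
    if ok.size == 0:
        # No grid point satisfies the limit: fall back to the all-zero model.
        coef = np.zeros(X.shape[1])
        return alphas[0], coef, y_mean

    best = ok[-1]  # last admissible index = weakest admissible penalty
    coef = coefs[:, best]
    intercept = y_mean - x_mean @ coef
    return alphas[best], coef, intercept


# Synthetic data for illustration only: 5 informative features out of 50.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
true_coef = np.zeros(50)
true_coef[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]
y = X @ true_coef + 0.1 * rng.normal(size=200)

alpha, coef, intercept = lasso_with_max_nonzero(X, y, max_nonzero=3)
print("alpha:", alpha)
print("nonzero indices:", np.flatnonzero(coef))
```

The count of nonzero coefficients is not guaranteed to shrink monotonically as `alpha` grows, which is why the sketch filters the whole grid instead of stopping at the first violation. If Lasso itself is not a hard requirement, scikit-learn's `Lars` and `OrthogonalMatchingPursuit` estimators accept an `n_nonzero_coefs` argument, although they solve different problems from the Lasso.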