machine-learning statistics scikit-learn mathematical-optimization

Penalties on variables in scikit-learn GradientBoostingClassifier?


Is there a way to penalize a feature so that it doesn't dominate the model? (In Salford Predictive Modeller, there is a setting called "Penalties on Variables".)

I have one categorical feature that I want to include in the model, but I don't want it to be the most important feature, because then the model doesn't properly capture the variance explained by the other predictors.


Solution

  • I don't think you can do that directly. I also don't really understand why you would want to, but you could try the following: train one model on the whole dataset and a separate model on the dataset with this feature removed. Then combine the predictions of the two models, for example by simple averaging or stacking.
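
A minimal sketch of the two-model idea, using simple averaging of predicted probabilities. The synthetic data, the column index `dominant_idx`, and the equal weighting are all assumptions made for illustration; they are not from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the real dataset (assumption).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, random_state=0
)
dominant_idx = 0  # hypothetical column index of the feature to de-emphasize

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Model 1: trained on all features.
full_model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Model 2: trained with the dominant feature removed.
X_train_reduced = np.delete(X_train, dominant_idx, axis=1)
X_test_reduced = np.delete(X_test, dominant_idx, axis=1)
reduced_model = GradientBoostingClassifier(random_state=0).fit(
    X_train_reduced, y_train
)

# Combine by averaging the class probabilities of the two models.
proba_full = full_model.predict_proba(X_test)
proba_reduced = reduced_model.predict_proba(X_test_reduced)
proba_avg = (proba_full + proba_reduced) / 2.0

# Final prediction: the class with the highest averaged probability.
y_pred = full_model.classes_[np.argmax(proba_avg, axis=1)]
```

Shifting the weights toward `proba_reduced` would reduce the influence of the dominant feature further. With real data, a categorical column would first need to be encoded numerically, and all of its encoded columns dropped for the second model.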