Is there any rule of thumb for initializing the num_leaves parameter in LightGBM? For example, for a dataset with 1000 features, we know that a tree depth of 10 can cover the entire dataset, so we can choose the depth accordingly, and the search space for tuning is limited as well.
But in LightGBM, how can we roughly guess this parameter? Otherwise its search space will be pretty large when using grid search.
Any intuition on selecting this parameter would be helpful.
The best recommendation I have come across is this awesome summary by Laurae on the LightGBM GitHub. As always, this very much depends on your data.
My personal rule of thumb, based on limited Kaggle experience, is to start by trying values in the range [10, 100]. But if you have a solid heuristic for choosing tree depth, you can always use it and set num_leaves to 2^tree_depth - 1. A full binary tree of that depth has 2^tree_depth leaves, so this keeps num_leaves just under that limit.
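To make the two heuristics concrete, here is a minimal sketch that turns a chosen tree depth into a small grid of num_leaves candidates. The helper names (leaves_upper_bound, num_leaves_grid) and the five-point spacing are my own illustration, not part of LightGBM; only num_leaves and max_depth are actual LightGBM parameter names.

```python
def leaves_upper_bound(tree_depth):
    """Cap from the rule of thumb above: 2^tree_depth - 1."""
    if tree_depth < 1:
        raise ValueError("tree_depth must be >= 1")
    return 2 ** tree_depth - 1


def num_leaves_grid(tree_depth, low=10, high=100, steps=5):
    """Evenly spaced integer candidates in [low, high], clipped to the depth-based cap.

    Hypothetical helper: the [10, 100] default is the starting range suggested
    above, and the spacing is an arbitrary choice for illustration.
    """
    cap = min(high, leaves_upper_bound(tree_depth))
    low = min(low, cap)
    if steps < 2 or low == cap:
        return [cap]
    # Integer arithmetic keeps the candidates exact and reproducible.
    values = {low + i * (cap - low) // (steps - 1) for i in range(steps)}
    return sorted(values)


# A parameter grid that could be handed to whatever grid-search tool you use.
param_grid = {
    "max_depth": [7],
    "num_leaves": num_leaves_grid(tree_depth=7),
}
print(param_grid)
```

With a shallow depth the cap takes over: for tree_depth=3 the grid collapses to the single value 7, which is a hint that the [10, 100] range is only useful once the depth allows that many leaves.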