I can't find a detailed description of how the bin mapping is constructed in the LightGBM paper, so I have several questions about it.
Is it static or dynamic? That is, during the growth of nodes, does the bin mapping change?
Is the number of bins the same for every feature? For example, does a one-hot feature have 2 bins?
For a real-valued feature, are the bin split points uniformly spaced, or is there some other principle for choosing them?
1: Binning is a preprocessing step: each feature is converted to discrete values before the optimization. The mapping is specific to your training data and does not change while the trees are grown.
2: There is a parameter you can tune, max_bin, that sets the maximum number of bins per feature. But of course if a feature has only 5 distinct values, it will have only 5 bins. So the number of bins can differ from feature to feature.
3: The split points for the bins are not chosen by equal width; they are chosen by frequency. If you set 100 bins, the split points are chosen so that each bin contains approximately 1% of your training points (it can be more or less when many points share the same value). This process is similar to the pandas qcut function.
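
To make the difference between frequency-based and equal-width bins concrete, here is a small sketch using pandas. It is only an analogy for points 2 and 3 above, not LightGBM's actual binning code; the data, the bin count of 10, and all variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical skewed, real-valued feature (not from any real dataset).
x = pd.Series(rng.exponential(scale=1.0, size=10_000))
n_bins = 10

# Equal-frequency bins: the edges are quantiles of the data,
# so every bin holds roughly the same number of points.
freq_codes, freq_edges = pd.qcut(x, q=n_bins, labels=False, retbins=True)

# Equal-width bins: the edges are evenly spaced between min and max,
# so a skewed feature piles most points into the first bins.
width_codes, width_edges = pd.cut(x, bins=n_bins, labels=False, retbins=True)

freq_counts = freq_codes.value_counts().sort_index()
width_counts = width_codes.value_counts().sort_index()
print("points per equal-frequency bin:", freq_counts.tolist())
print("points per equal-width bin:    ", width_counts.tolist())

# A feature with only 5 distinct values cannot fill 10 bins.
# duplicates="drop" collapses repeated quantile edges; qcut may merge
# values further, so this only shows the upper bound, not LightGBM's count.
few = pd.Series(rng.integers(0, 5, size=10_000))
few_codes = pd.qcut(few, q=n_bins, labels=False, duplicates="drop")
n_few_bins = few_codes.nunique()
print("distinct values:", few.nunique(), "occupied bins:", n_few_bins)
```

The equal-frequency counts come out nearly identical across bins, while the equal-width counts are heavily concentrated at the low end, which is why frequency-based edges make better use of a fixed bin budget on skewed features.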
Hope I have covered your questions.