
How does LightGBM convert feature_fraction to an integer value?


Does anyone know how LightGBM converts the feature_fraction parameter, which the user defines as a non-integer like 0.8, to an integer value?

Does it use a floor or a ceiling function?

I could not find this in the documentation (or in a skim of the source code on GitHub).

Doc says:

feature_fraction , default = 1.0, type = double, ... , constraints: 0.0 < feature_fraction <= 1.0 LightGBM will randomly select a subset of features on each iteration (tree) if feature_fraction is smaller than 1.0. For example, if you set it to 0.8, LightGBM will select 80% of features before training each tree.

If I have three features, what does feature_fraction = 0.5 mean? How many features does each decision tree use: 1 or 2?


Solution

  • I asked the question on microsoft/LightGBM GitHub. LightGBM converts the fraction with this formula, expressed in Python:

    import numpy as np

    total_cnt = 3  # I have three features

    # I want my decision trees to use 50% of the features
    # (the subset is drawn once per tree, per the docs)
    feature_fraction = 0.5

    # this is the integer value: the number of features used per tree
    max_features = int(np.floor((feature_fraction * total_cnt) + 0.5))
    

    For more detail, please take a look here.
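Applying that formula to the three-feature example answers the original question: the result is 2, not 1, because floor(x + 0.5) is a "round half up" rule rather than a plain floor. A quick sketch (the helper name num_features is mine, not LightGBM's):

```python
import numpy as np

def num_features(feature_fraction, total_cnt):
    # LightGBM-style rounding: floor(fraction * count + 0.5),
    # i.e. round half up to the nearest integer
    return int(np.floor(feature_fraction * total_cnt + 0.5))

print(num_features(0.5, 3))   # 0.5 * 3 = 1.5 -> floor(2.0) = 2
print(num_features(0.8, 10))  # 0.8 * 10 = 8.0 -> 8
print(num_features(0.3, 3))   # 0.3 * 3 = 0.9 -> floor(1.4) = 1
```

So with three features and feature_fraction = 0.5, each tree is trained on 2 of the 3 features.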