machine-learning, xgboost

XGBoostError: value for parameter exceed bound


I am training an XGBoost model and doing hyper-parameter tuning with RandomizedSearchCV. I specify the parameter distributions as:

from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBRegressor

# Define an XGBoost regression model
model = XGBRegressor()

params = {
    "colsample_bytree": uniform(0.1, 0.2),  # fraction of columns to sample
    "gamma": uniform(0, 0.3),               # min loss reduction required for next split
    "learning_rate": uniform(0.02, 0.3),    # default 0.1
    "n_estimators": randint(100, 150),      # default 100
    "subsample": uniform(0.8, 0.75),        # % of rows to use in training sample
}

r = RandomizedSearchCV(model, param_distributions=params, n_iter=100,
                       scoring="neg_mean_absolute_error", cv=3, n_jobs=1)

I get the following error, even though the range I have specified for subsample appears to be within the bound [0,1]:

raise XGBoostError(py_str(_LIB.XGBGetLastError()))
   xgboost.core.XGBoostError: value 1.10671 for Parameter subsample exceed bound [0,1]

  warnings.warn("Estimator fit failed. The score on this train-test"

Any ideas why this could be happening?


Solution

  • I think the issue comes from:

    uniform(0.8, 0.75)
    

    For numpy and random, the first argument of the function is the lower limit and the second is the upper limit. Hence, for numpy and random you would want:

    uniform(0.75, 0.8)
    

    This applies to both numpy.random.uniform and random.uniform.
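
    A quick illustrative check (not part of the original answer): with both of those APIs, samples stay inside the two stated limits.

    ```python
    import random

    import numpy as np

    # For random.uniform and numpy's uniform, the two arguments are the
    # lower and upper limits of the sampling interval.
    rng = np.random.default_rng(0)

    py_samples = [random.uniform(0.75, 0.8) for _ in range(1000)]
    np_samples = rng.uniform(0.75, 0.8, size=1000)

    print(all(0.75 <= s <= 0.8 for s in py_samples))  # True
    print(all(0.75 <= s <= 0.8 for s in np_samples))  # True
    ```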

    However, for scipy.stats.uniform the definition is slightly different: "Using the parameters loc and scale, one obtains the uniform distribution on [loc, loc + scale]." So uniform(0.8, 0.75) actually samples from [0.8, 1.55], which is exactly why subsample can end up above 1 (the 1.10671 in the error). For scipy you want:

    uniform(0.75, 0.05)
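
    A minimal sketch (assuming scipy is installed) contrasting the buggy and fixed specifications; scipy.stats.uniform(loc, scale) has support [loc, loc + scale]:

    ```python
    from scipy.stats import uniform

    # Buggy spec from the question: loc=0.8, scale=0.75,
    # so the support is [0.8, 1.55] and samples can exceed 1.
    bad = uniform(0.8, 0.75)
    print(bad.ppf(0), bad.ppf(1))  # endpoints: loc and loc + scale

    # Fixed spec: loc=0.75, scale=0.05, so the support is [0.75, 0.8].
    good = uniform(0.75, 0.05)
    samples = good.rvs(size=1000, random_state=0)
    print(samples.min() >= 0.75 and samples.max() <= 0.8)  # True
    ```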