How to make sure that mlr takes only 6 cores


I am using the following code on an 8-core Linux machine and it takes all 8 cores (each of the 6 workers shows about 130% CPU utilization).

    library(mlr); library(parallel); library(parallelMap)

    # Load data
    iris_num <- iris; iris_num$Species <- as.numeric(iris_num$Species)

    # Create task
    traintask <- makeRegrTask(data = iris_num, target = 'Species')

    # Create learner
    lrn <- makeLearner('regr.xgboost'); nthread <- min(6, detectCores())
    lrn$par.vals <- list(print.every.n = 500, objective = "reg:linear", eval_metric = "rmse", nthread = nthread)

    # Set parameter space
    params <- makeParamSet(
      makeIntegerParam("max_depth", lower = 5L, upper = 20L),        # default 6
      makeNumericParam("min_child_weight", lower = 1L, upper = 20L), # default 1
      makeNumericParam("subsample", lower = 0.5, upper = 1),
      makeNumericParam("colsample_bytree", lower = 0.5, upper = 1),
      makeIntegerParam("nrounds", lower = 3000, upper = 5000),
      makeNumericParam("lambda", lower = 0.75, upper = 1),
      makeNumericParam("lambda_bias", lower = 0, upper = 0.75),
      makeNumericParam("gamma", lower = 0, upper = 1),
      makeNumericParam("eta", lower = 0.01, upper = 0.05)            # default 0.3
    )

    # Set resampling strategy
    rdesc <- makeResampleDesc("CV", iters = 9L)

    # Search strategy
    ctrl <- makeTuneControlRandom(maxit = 10L)

    # Set parallel backend
    if (Sys.info()['sysname'] == "Linux") {
      parallelStartMulticore(cpus = nthread, show.info = TRUE)
    } else parallelStartSocket(cpus = nthread, show.info = TRUE)

    tune <- tuneParams(learner = lrn, task = traintask, resampling = rdesc,
                       measures = rmse, par.set = params, control = ctrl, show.info = TRUE)

Solution

  • nthread <- min(6, detectCores())
    

    This line executes immediately and will always return 6 on an 8-core machine. You then use that value in two places: for the xgboost learner and for the parallel backend used in tuning. Each of your 6 tuning workers will try to build an xgboost model that itself asks for 6 threads, so you end up with up to 36 threads competing for 8 cores. That is why every core is saturated.

    I'm not aware of a way to have mlr (or anything else) respect the number of 'unused' cores. If you want to cap the run at 6 cores, I would recommend splitting them up manually: for instance, give tuneParams 2 workers and give each xgboost model 2 threads. Since the tuneParams process mostly sits idle waiting to hear back from the xgboost models, you could probably give the xgboost models 3 threads each; see the sketch below.
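
    A minimal sketch of that split, assuming the traintask, rdesc, params, and ctrl objects from the question are already defined; the names n_workers and n_xgb_threads and the 2 x 3 split are only one illustration of how to stay within 6 cores:

        library(mlr); library(parallel); library(parallelMap)

        # Budget: 6 cores = 2 tuning workers x 3 xgboost threads each
        n_workers <- 2
        n_xgb_threads <- 3

        # Each xgboost model gets its own small thread count
        lrn <- makeLearner("regr.xgboost")
        lrn$par.vals <- list(print.every.n = 500, objective = "reg:linear",
                             eval_metric = "rmse", nthread = n_xgb_threads)

        # The tuning/resampling level only gets n_workers processes
        if (Sys.info()["sysname"] == "Linux") {
          parallelStartMulticore(cpus = n_workers, show.info = TRUE)
        } else parallelStartSocket(cpus = n_workers, show.info = TRUE)

        tune <- tuneParams(learner = lrn, task = traintask, resampling = rdesc,
                           measures = rmse, par.set = params, control = ctrl, show.info = TRUE)

        # Shut the backend down once tuning has finished
        parallelStop()

    With this setup the total thread count is roughly n_workers * n_xgb_threads = 6 instead of 6 * 6 = 36.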