
gbm multinomial distribution


I'm trying to use gbm for the first time (actually any kind of regression tree for the first time) on my data, which consist of 14 continuous predictor variables and a factor with 13 levels as the response variable. I came to gbm via a very good description by Elith et al., who, however, used a modification of the basic gbm package that can't handle multinomial distributions. The gbm help claims that it can handle this:

"distribution: either a character string specifying the name of the distribution to use or a list with a component name specifying the distribution and any additional param-eters needed. If not specified, gbm will try to guess: if the response has only 2 unique values, bernoulli is assumed; otherwise, if the response is a factor, multinomial is assumed; otherwise, if the response has class "Surv", coxph is assumed; otherwise, gaussian is assumed. Currently available options are "gaussian" (squared error), "laplace" (absolute loss), "tdist" (t-distribution loss), "bernoulli" (logistic regression for 0-1 out-comes), "huberized" (huberized hinge loss for 0-1 outcomes), "multinomial" (classification when there are more than 2 classes), "adaboost" (the AdaBoost exponential loss for 0-1 outcomes), "poisson" (count outcomes), "coxph" (right censored observations), "quantile", or "pairwise" (ranking measure using the LambdaMart algorithm)."

Nevertheless, it doesn't work, whether I specify "multinomial" or let it guess. Does anyone have an idea what I am doing wrong? Or am I misunderstanding something completely: if my response follows a multinomial distribution, doesn't that mean my loss function should also be multinomial? It runs if I choose "gaussian", but I guess in that case something completely different is calculated? I'd appreciate any help! agnes


Solution

  • Are you using the newest version of gbm? I had a similar issue which was resolved after re-installing the gbm package.
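
    To check whether the installed version is the problem, a minimal sketch along these lines may help. It is not the asker's code: the built-in iris data (4 predictors, a 3-level factor response) stands in for the real data, and the tuning values are arbitrary placeholders. Note that some later gbm releases print a warning about multinomial support, so the behaviour may depend on the version reported in the first step.

```r
# Sketch only: iris is a stand-in for the asker's data, and all tuning
# values below are illustrative assumptions, not recommendations.
library(gbm)

# Check which gbm version is installed before (and after) re-installing.
print(packageVersion("gbm"))

set.seed(1)  # gbm subsamples rows by default, so fix the seed for repeatability
fit <- gbm(
  Species ~ .,
  data = iris,
  distribution = "multinomial",  # set explicitly rather than letting gbm guess
  n.trees = 100,
  interaction.depth = 2,
  shrinkage = 0.05,
  n.minobsinnode = 5
)

# For a multinomial fit, type = "response" returns class probabilities
# as an array of dimension (observations x classes x 1).
probs <- predict(fit, newdata = iris, n.trees = 100, type = "response")

# Pick the most probable class per observation and compare with the truth.
pred_class <- dimnames(probs)[[2]][apply(probs[, , 1], 1, which.max)]
print(table(predicted = pred_class, actual = iris$Species))
```

    If this runs on iris but the same call fails on your own data, the cause is more likely in the data (for example, the response not being stored as a factor) than in the installation.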