I built a model with the code below. One of my predictors, "ot_soilTextu", is categorical.
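library(caret)  # trainControl() and varImp() below are called without the caret:: prefix
# rf_CV is the earlier cross-validated fit; only its best mtry is reused here
rf_final <- caret::train(BULK_DENSITY ~ forest_210 + legumes_158 + corn_147 + bio3 +
                           ot_soilTextu + grass_122 + mrvbf + bio5 + bio15 + bio9 +
                           grav_1st_1 + bio18 + deme2000 + grav_1st_2,
                         method = "rf",
                         data = cq,
                         tuneGrid = expand.grid(mtry = rf_CV$bestTune$mtry),
                         trControl = trainControl(method = "none"),
                         importance = TRUE)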
Then I made the importance plot with the following code.
imp <- varImp(rf_final)
plot(imp, main = "BD: 14V", xlab = list(font = 1, cex = 1.25),
     scales = list(x = list(font = 1, cex = 1), y = list(font = 1, cex = 1)))
The importance plot shows every level of the categorical variable as a separate entry, which is not what I want. I need to see the importance of the 14 variables only, not one entry per level of the categorical variable. Is there a way to do this without writing extra code? Here is the importance plot:
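You can convert the categorical variable to numeric before calling caret::train, for example:
cq$ot_soilTextu <- as.numeric(as.factor(cq$ot_soilTextu))
(The as.factor wrapper keeps this working if the column is stored as character rather than as a factor.) Then run caret::train again. The variable now enters the model as a single numeric column, so the importance plot shows only the 14 variables. Keep in mind that the random forest will then treat the integer codes as ordered values rather than as unordered categories, so the refitted model is not identical to the original one.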
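If you would rather keep ot_soilTextu as a factor, another option is to skip the formula interface. With a formula, caret expands factors into dummy columns before fitting, which is why each level gets its own importance entry. The x/y interface of train passes the data frame through as-is. The following is only a minimal sketch of that idea: it uses the built-in iris data as a stand-in for cq, with Species playing the role of ot_soilTextu, and the names predictors, rf_xy and imp_xy are made up for illustration. It assumes the randomForest package is installed.

```r
library(caret)  # method = "rf" also requires the randomForest package

set.seed(1)

# Stand-in data: iris replaces the asker's `cq`;
# the factor Species plays the role of ot_soilTextu.
predictors <- c("Sepal.Width", "Petal.Length", "Species")

# x/y interface: the factor column is handed to the model unchanged,
# so no dummy variables are created.
rf_xy <- train(x = iris[, predictors],
               y = iris$Sepal.Length,
               method = "rf",
               tuneGrid = data.frame(mtry = 2),
               trControl = trainControl(method = "none"),
               importance = TRUE)

imp_xy <- varImp(rf_xy)
rownames(imp_xy$importance)  # one entry per predictor column
plot(imp_xy)
```

For the original model this would mean passing the 14 predictor columns of cq as x and cq$BULK_DENSITY as y, with ot_soilTextu stored as a factor.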