When evaluating the performance of an SVM, an RF and a DT (max_depth = 3), I am getting far superior results with the RF model. The data being modeled is real-world data. All models are evaluated using stratified cross-validation, since the data set is imbalanced.
For the 4 classes, I am getting the following precision, recall and F1 scores:
Originally, the data set had the following value_counts for the 4 classes:
How could RF be so much better than SVM and DT?
Thanks in advance!
These results are entirely plausible! A Random Forest is a much more powerful model than a single Decision Tree, because it is exactly that: an ensemble of DTs. Ensembles (combinations of several models) are known to generalise well to unseen data. Where a single Decision Tree or an SVM overfits, a Random Forest usually holds up, because internally many DTs, each trained on a bootstrap sample of the data and considering a random subset of features at each split, cast a vote on the prediction. Averaging these decorrelated votes reduces variance. Also note that a DT capped at max_depth = 3 is a very weak learner, so a large gap to an RF of full-depth trees is expected.
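A minimal sketch of this comparison, assuming scikit-learn and using a synthetic imbalanced 4-class dataset as a stand-in for your real data (all names and parameters here are illustrative, not your actual setup):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 4 imbalanced classes (assumed proportions).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10,
    n_classes=4, weights=[0.6, 0.2, 0.15, 0.05], random_state=0,
)

# Stratified CV keeps the class proportions in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "DT (max_depth=3)": DecisionTreeClassifier(max_depth=3, random_state=0),
    "RF (200 trees)": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM (RBF)": SVC(),
}

for name, model in models.items():
    # Macro-F1 weights all classes equally, which matters when imbalanced.
    scores = cross_val_score(model, X, y, cv=cv, scoring="f1_macro")
    print(f"{name}: mean macro-F1 = {scores.mean():.3f}")
```

On data like this, the RF typically beats the depth-capped DT by a wide margin, mirroring what you observed.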