machine-learning scikit-learn variance curves tradeoff

High bias or variance? - SVM and weird learning curves


I have never seen learning curves like these. Am I right that severe overfitting is occurring? The model fits the training data better and better, while it generalizes worse and worse to the test data.

Usually, when there is high variance, as here, more examples should help. In this case I suspect they won't. Why is that? And why are learning curves like these so hard to find in the literature and tutorials?

Learning curves. SVM, param1 is C, param2 is gamma


Solution

  • Keep in mind that a kernel SVM is a non-parametric model, so more samples do not necessarily reduce variance. A reduction in variance can be more or less guaranteed for a parametric model (such as a neural net), but an SVM is not one of them: more samples mean not only better training data but also a more complex model, because the number of support vectors can grow with the training set. Your learning curves are a typical example of SVM overfitting, which happens a lot with the RBF kernel.
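
To make the point above concrete, here is a minimal sketch using scikit-learn. It is not the asker's setup: the synthetic dataset, `C=10.0`, `gamma=1.0`, and the helper names `rbf_learning_curve` and `n_support_vectors` are all assumptions chosen for illustration. It computes a learning curve for an RBF SVM and counts support vectors at two training-set sizes, so you can check on your own data whether the model grows with the sample count.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: the original dataset is not available).
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           random_state=0)


def rbf_learning_curve(gamma, C=10.0):
    """Mean train/validation accuracy of an RBF SVM at 5 training-set sizes."""
    sizes, train_scores, test_scores = learning_curve(
        SVC(kernel="rbf", C=C, gamma=gamma), X, y,
        train_sizes=np.linspace(0.2, 1.0, 5), cv=5)
    return sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)


def n_support_vectors(n, gamma, C=10.0):
    """Number of support vectors after fitting on the first n samples."""
    model = SVC(kernel="rbf", C=C, gamma=gamma).fit(X[:n], y[:n])
    return int(model.n_support_.sum())


# A deliberately large gamma (illustrative value) makes the kernel very narrow.
sizes, train_mean, test_mean = rbf_learning_curve(gamma=1.0)
for n, tr, te in zip(sizes, train_mean, test_mean):
    print(f"n={n:4d}  train={tr:.3f}  validation={te:.3f}")

for n in (100, 400):
    print(f"n={n:4d}  support vectors={n_support_vectors(n, gamma=1.0)}")
```

If the support-vector count keeps climbing with `n` while the gap between training and validation accuracy stays wide, that suggests the fix lies in regularisation (a smaller `gamma` or `C`, for example via a grid search) rather than in collecting more data.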