As I understand it, support vector regression in scikit-learn takes an integer for the degree parameter. However, it seems to me that lower-degree polynomial terms are not considered.
Running the following example:
import numpy as np
from sklearn.svm import SVR

# 40 random samples in [0, 5), sorted so the fit plots as a curve
X = np.sort(5 * np.random.rand(40, 1), axis=0)
# Quadratic target: Y = 2X - 0.75X^2
Y = (2 * X - .75 * X**2).ravel()
# Add noise to every fifth target value
Y[::5] += 3 * (0.5 - np.random.rand(8))

svr_poly = SVR(kernel='poly', C=1e3, degree=2)
y_poly = svr_poly.fit(X, Y).predict(X)
(copied and slightly modified from the scikit-learn SVR example at http://scikit-learn.org/stable/auto_examples/svm/plot_svm_regression.html)
Plotting the data gives a rather poor fit (even when skipping the line that adds random noise to the Y-values).
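For reference, I plot the fit with something like this (essentially the plotting code from the linked example):

import matplotlib.pyplot as plt

# Scatter the noisy data and overlay the SVR prediction
plt.scatter(X, Y, color='darkorange', label='data')
plt.plot(X, y_poly, color='cornflowerblue', label='polynomial model')
plt.legend()
plt.show()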
It seems like the lower-order terms are not considered. I tried to pass a list [1, 2] for the degree parameter, but then I got an error from the predict call. Is there any way to include them? Did I miss something obvious?
I think the lower-order polynomial terms are included in the fitted model, but they are not visible in the plot because the C and epsilon parameters are not well suited to the data. One can usually obtain a better fit by fine-tuning the parameters with GridSearchCV. Since the data in this case is not centered, the coef0 parameter also has a significant effect.
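To see why coef0 matters: scikit-learn's polynomial kernel is K(x, x') = (gamma * <x, x'> + coef0)^degree, so whenever coef0 is nonzero the binomial expansion contains the lower-order (linear and constant) terms as well. A quick numerical check of that expansion for degree=2 (the sample points are arbitrary):

import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

x = np.array([[1.0, 2.0]])
y = np.array([[0.5, -1.0]])
gamma, coef0 = 0.5, 5.0

# Kernel value as computed by scikit-learn
K = polynomial_kernel(x, y, degree=2, gamma=gamma, coef0=coef0)

# Binomial expansion: quadratic, linear, and constant terms
dot = (x @ y.T).item()
expanded = gamma**2 * dot**2 + 2 * gamma * coef0 * dot + coef0**2
print(np.isclose(K[0, 0], expanded))  # True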
The following parameters should give a better fit for the data:
svr_poly = SVR(kernel='poly', degree=2, C=100, epsilon=0.0001, coef0=5)
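If hand-tuning is a hassle, a grid search along these lines usually finds workable values (the grid below is an illustrative guess, not a tuned range):

import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

X = np.sort(5 * np.random.rand(40, 1), axis=0)
Y = (2 * X - .75 * X**2).ravel()
Y[::5] += 3 * (0.5 - np.random.rand(8))

# Illustrative grid around the values suggested above
param_grid = {
    'C': [1, 10, 100, 1000],
    'epsilon': [1e-4, 1e-3, 1e-2, 1e-1],
    'coef0': [0, 1, 5, 10],
}
search = GridSearchCV(SVR(kernel='poly', degree=2), param_grid, cv=5)
search.fit(X, Y)
print(search.best_params_)
y_poly = search.best_estimator_.predict(X)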