As I understand it, support vector regression in scikit-learn takes an integer for the degree parameter. However, it seems to me that lower-degree polynomial terms are not considered.
Running the following example:
import numpy as np
from sklearn.svm import SVR

# 40 random samples in [0, 5), sorted so the fit plots as a curve
X = np.sort(5 * np.random.rand(40, 1), axis=0)
# Quadratic target: Y = 2X - 0.75X^2
Y = (2 * X - .75 * X**2).ravel()
# Add noise to every fifth target value
Y[::5] += 3 * (0.5 - np.random.rand(8))

svr_poly = SVR(kernel='poly', C=1e3, degree=2)
y_poly = svr_poly.fit(X, Y).predict(X)
(copied and slightly modified from the scikit-learn SVR example at http://scikit-learn.org/stable/auto_examples/svm/plot_svm_regression.html)
Plotting the data gives a rather poor fit (even when skipping the line that adds random noise to the Y-values).
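For reference, I plot the fit with something like this (essentially the plotting code from the linked example):

import matplotlib.pyplot as plt

# Scatter the noisy data and overlay the SVR prediction
plt.scatter(X, Y, color='darkorange', label='data')
plt.plot(X, y_poly, color='cornflowerblue', label='polynomial model')
plt.legend()
plt.show()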
It seems like the lower-order terms are not considered. I tried to pass a list [1, 2] for the degree parameter, but then I got an error from the predict call. Is there any way to include them? Did I miss something obvious?
I think the lower-order polynomial terms are included in the fitted model, but they are not visible in the plot because the C and epsilon parameters are not well suited to the data. One can usually obtain a better fit by fine-tuning the parameters with GridSearchCV. Since the data in this case is not centered, the coef0 parameter also has a significant effect.
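To see why coef0 matters: scikit-learn's polynomial kernel is K(x, x') = (gamma * <x, x'> + coef0)^degree, so whenever coef0 is nonzero the binomial expansion contains the lower-order (linear and constant) terms as well. A quick numerical check of that expansion for degree=2 (the sample points are arbitrary):

import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

x = np.array([[1.0, 2.0]])
y = np.array([[0.5, -1.0]])
gamma, coef0 = 0.5, 5.0

# Kernel value as computed by scikit-learn
K = polynomial_kernel(x, y, degree=2, gamma=gamma, coef0=coef0)

# Binomial expansion: quadratic, linear, and constant terms
dot = (x @ y.T).item()
expanded = gamma**2 * dot**2 + 2 * gamma * coef0 * dot + coef0**2
print(np.isclose(K[0, 0], expanded))  # True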
The following parameters should give a better fit for the data:
svr_poly = SVR(kernel='poly', degree=2, C=100, epsilon=0.0001, coef0=5)
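If hand-tuning is a hassle, a grid search along these lines usually finds workable values (the grid below is an illustrative guess, not a tuned range):

import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

X = np.sort(5 * np.random.rand(40, 1), axis=0)
Y = (2 * X - .75 * X**2).ravel()
Y[::5] += 3 * (0.5 - np.random.rand(8))

# Illustrative grid around the values suggested above
param_grid = {
    'C': [1, 10, 100, 1000],
    'epsilon': [1e-4, 1e-3, 1e-2, 1e-1],
    'coef0': [0, 1, 5, 10],
}
search = GridSearchCV(SVR(kernel='poly', degree=2), param_grid, cv=5)
search.fit(X, Y)
print(search.best_params_)
y_poly = search.best_estimator_.predict(X)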