I have this Python code, which fits an SVM; I want to try it with different kernels:
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3)
# The gamma parameter is the kernel coefficient for kernels rbf/poly/sigmoid
svm = SVC(gamma='auto', probability=True)
svm.fit(X_train,y_train.values.ravel())
prediction = svm.predict(X_test)
prediction_prob = svm.predict_proba(X_test)
print('Accuracy:', accuracy_score(y_test,prediction))
print('AUC:',roc_auc_score(y_test,prediction_prob[:,1]))
print(X_train)
print(y_train)
Now I want to build this with a different kernel, rbf, and store the values in arrays. I need to store the outcome of the folds in lists, something like this:
def svm_grid_search(parameters, cv):
    # Store the outcome of the folds in these lists
    means = []
    stds = []
    params = []
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3)
    for parameter in parameters:
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3)
        # The gamma parameter is the kernel coefficient for kernels rbf/poly/sigmoid
        svm = SVC(gamma=1,kernel ='rbf',probability=True)
        svm.fit(X_train,y_train.values.ravel())
        prediction = svm.predict(X_test)
        prediction_prob = svm.predict_proba(X_test)
    return means, stddevs, params
The result should come from code like the one below, using the different parameters and kernel functions:
from sklearn.model_selection import GridSearchCV
parameters = {'kernel':['linear','poly','rbf'],'C':[0.2,0.5,1.0]}
means, stddevs, params = svm_grid_search(parameters, 10)
print('Mean AUC (+/- standard deviation), for parameters')
for mean, std, params in zip(means, stddevs, params):
    print("%0.3f (+/- %0.03f) for %r"
          % (mean, std, params))
However, I keep getting this error, which suggests I have not stored the fold results in the lists correctly:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-3f5f99897a92> in <module>()
3 parameters = {'kernel':['linear','poly','rbf'],'C':[0.2,0.5,1.0]}
4
----> 5 means, stddevs, params = svm_grid_search(parameters, 10)
6
7 print('Mean AUC (+/- standard deviation), for parameters')
TypeError: 'NoneType' object is not iterable
I would be grateful if anyone could point me in the right direction.
It looks like you aren’t actually storing anything in the lists you create in your function: nothing is ever appended to means, stds or params. You also create an empty list called stds but return stddevs, which you never define.
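As for the TypeError itself: a Python function that finishes without reaching a return statement gives back None, and unpacking None into three names fails. That is probably what happened in the cell that produced your traceback, though I can only guess from what you posted. A minimal sketch, where no_return is a made-up stand-in and not your function:

```python
def no_return():
    # Builds a list but never returns it, so the call evaluates to None.
    values = []
    values.append(1)


try:
    # Unpacking None into three names raises TypeError.
    a, b, c = no_return()
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)
```

The exact wording of the message differs between Python versions, but it always mentions NoneType.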
Edit: I think you are trying to use sklearn's GridSearchCV, but you never call it either. Here is some code to get you started:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn import datasets
import pandas as pd
iris = datasets.load_iris()
X = iris.data[:, :2] # we only take the first two features.
y = iris.target
parameters = {'kernel':['linear','poly','rbf'],'C':[0.2,0.5,1.0]}
svc = SVC()
clf = GridSearchCV(svc, parameters)
clf.fit(X,y)
results = pd.DataFrame.from_dict(clf.cv_results_)
Here results holds the outcome of your grid search, one row per parameter combination (only the first few columns are shown):
   mean_fit_time  std_fit_time  mean_score_time  std_score_time  param_C
0       0.000600      0.000490           0.0004        0.000490      0.2
1       0.000400      0.000490           0.0002        0.000400      0.2
2       0.000600      0.000490           0.0004        0.000490      0.2
3       0.001008      0.000013           0.0000        0.000000      0.5
4       0.000800      0.000400           0.0002        0.000399      0.5
5       0.000800      0.000400           0.0004        0.000490      0.5
6       0.000800      0.000400           0.0000        0.000000      1.0
7       0.000800      0.000400           0.0002        0.000400      1.0
8       0.000800      0.000400           0.0002        0.000400      1.0
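The timing columns are not the interesting part. The per-candidate scores live in the mean_test_score and std_test_score columns of cv_results_, next to params and rank_test_score. A small sketch that repeats the fit and pulls out just those columns; summary is simply a name I picked, and the score is accuracy because no scoring argument is given:

```python
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris.data[:, :2]  # first two features only, as above
y = iris.target

parameters = {'kernel': ['linear', 'poly', 'rbf'], 'C': [0.2, 0.5, 1.0]}
clf = GridSearchCV(SVC(), parameters)
clf.fit(X, y)

results = pd.DataFrame(clf.cv_results_)
# Keep only the columns that describe each candidate and its CV score.
summary = results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
print(summary.sort_values('rank_test_score'))
```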
If you want to wrap that in your own function, read up a bit on Python functions and on what GridSearchCV returns (its cv_results_ attribute). Hope it helps.
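For what it's worth, here is one way such a wrapper could look. Treat it as a sketch under a few assumptions of mine rather than the definitive version: I pass X and y in as arguments instead of relying on globals, I use scoring='roc_auc' (which only works as-is for a binary target), I keep gamma='auto' from your first snippet, and I generate synthetic data with make_classification because I don't have your X and y.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def svm_grid_search(X, y, parameters, cv):
    """Return mean AUC, AUC standard deviation and params per candidate."""
    # For a binary target, 'roc_auc' scores via SVC.decision_function,
    # so probability=True is not needed here.
    search = GridSearchCV(SVC(gamma='auto'), parameters,
                          scoring='roc_auc', cv=cv)
    search.fit(X, y)
    results = search.cv_results_
    means = list(results['mean_test_score'])   # mean score over the folds
    stds = list(results['std_test_score'])     # std of the score over the folds
    params = list(results['params'])           # one dict per candidate
    return means, stds, params


# Synthetic binary data standing in for the question's X and y (assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
parameters = {'kernel': ['linear', 'poly', 'rbf'], 'C': [0.2, 0.5, 1.0]}
means, stds, params = svm_grid_search(X, y, parameters, 10)

print('Mean AUC (+/- standard deviation), for parameters')
for mean, std, param in zip(means, stds, params):
    print("%0.3f (+/- %0.03f) for %r" % (mean, std, param))
```

Note the loop variable is called param, not params, so it does not overwrite the list being iterated.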