Tags: python, python-3.x, machine-learning, scikit-learn, sklearn-pandas

How to use a custom scoring function in sklearn cross_val_score


I want to use adjusted R-squared as the metric in the cross_val_score function. I tried wrapping it with the make_scorer function, but it is not working.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

X_tr, X_test, y_tr, y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

regression = LinearRegression(normalize=True)

def adjusted_rsquare(y_true, y_pred):
    # Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1), with p taken from X_test
    adjusted_r_squared = 1 - (1 - r2_score(y_true, y_pred)) * (len(y_pred) - 1) / (len(y_pred) - X_test.shape[1] - 1)
    return adjusted_r_squared

my_scorer = make_scorer(adjusted_rsquare, greater_is_better=True)
score = np.mean(cross_val_score(regression, X_tr, y_tr, scoring=my_scorer, cv=crossvalidation, n_jobs=1))

It throws an error:

IndexError: positional indexers are out-of-bounds

Is there any way to use my custom function, i.e. adjusted_rsquare, with cross_val_score?


Solution

  • make_scorer expects the scoring function itself. adjusted_rsquare(X, Y) is a number, not a function, so pass the function without calling it and create the scorer like this:

    my_scorer = make_scorer(adjusted_rsquare, greater_is_better=True)
    

    You also need to change the signature of the score function:

    def adjusted_rsquare(y_true, y_pred, **kwargs):
    

    That is the signature you should use: the function receives the true target values and the model's predictions, and returns a number that compares the two.
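
Putting the two points together, a minimal end-to-end sketch might look like the following. Several things here are assumptions and not part of the original post: the data is synthetic (make_regression stands in for the asker's X and Y), the n_features keyword is a hypothetical parameter name, and crossvalidation is rebuilt as a KFold because the question does not show how it was defined. One possible source of the IndexError, again only a guess, is a splitter whose indices were generated from the full dataset while cross_val_score received only X_tr; a KFold created without a length argument avoids that, since it derives its indices from whatever data it is given.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Synthetic stand-in for the asker's X and Y (assumption: 200 rows, 5 features).
X, Y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_tr, X_test, y_tr, y_test = train_test_split(X, Y, test_size=0.2, random_state=0)


def adjusted_rsquare(y_true, y_pred, n_features=1, **kwargs):
    """Adjusted R^2 for one fold; n_features is the number of predictors."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - n_features - 1)


# Pass the function itself, not the result of calling it.
# Extra keyword arguments given to make_scorer are forwarded to the score function.
my_scorer = make_scorer(
    adjusted_rsquare, greater_is_better=True, n_features=X_tr.shape[1]
)

# The splitter computes its indices from the data passed to cross_val_score.
crossvalidation = KFold(n_splits=5, shuffle=True, random_state=0)

scores = cross_val_score(
    LinearRegression(), X_tr, y_tr, scoring=my_scorer, cv=crossvalidation, n_jobs=1
)
score = np.mean(scores)
print(score)
```

Forwarding the number of predictors as a keyword argument keeps the score function free of globals such as X_test. The sample count n is taken from each validation fold, so the adjustment is computed per fold and then averaged.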