Tags: python, scikit-learn, cross-validation

Calculate the average of each metric in cross validation


I'm trying to calculate some metrics using StratifiedKFold cross-validation.

import pandas as pd
from sklearn.model_selection import StratifiedKFold, cross_validate

skfold = StratifiedKFold(n_splits=5, random_state=42, shuffle=True)

dtc_score = cross_validate(models[0], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
rfc_score = cross_validate(models[1], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
abc_score = cross_validate(models[2], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
etc_score = cross_validate(models[3], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
gbc_score = cross_validate(models[4], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
bgc_score = cross_validate(models[5], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
knn_score = cross_validate(models[6], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
logreg_score = cross_validate(models[7], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
nb_score = cross_validate(models[8], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
svm_score = cross_validate(models[9], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
xgb_score = cross_validate(models[10], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)
mlp_score = cross_validate(models[11], X, y, scoring=('accuracy', 'precision', 'recall', 'f1'), cv=skfold, n_jobs=-1, verbose=1)

After that, I put the results of each run into a DataFrame, but the output isn't what I want.

cv_result = [
    dtc_score, rfc_score, abc_score, etc_score, gbc_score, bgc_score, 
    knn_score, logreg_score, nb_score, svm_score, xgb_score, mlp_score]


df_cv_result = pd.DataFrame(cv_result, index=model_name)
df_cv_result

The result looks like this:

fit_time score_time test_accuracy test_precision test_recall test_f1
DecisionTreeClassifier [0.06297850608825684, 0.06297850608825684, 0.1... [0.025590181350708008, 0.025590181350708008, 0... [0.783008658008658, 0.7943722943722944, 0.7662... [0.7193229901269393, 0.7398843930635838, 0.708... [0.7162921348314607, 0.7191011235955056, 0.668... [0.7178043631245602, 0.7293447293447292, 0.687...
RandomForestClassifier [1.759207010269165, 1.774831771850586, 1.75920... [0.10936832427978516, 0.10936856269836426, 0.1... [0.8257575757575758, 0.8138528138528138, 0.806... [0.7809798270893372, 0.7697947214076246, 0.769... [0.7612359550561798, 0.7373595505617978, 0.709... [0.7709815078236132, 0.7532281205164992, 0.738...
AdaBoostClassifier [0.7297384738922119, 0.7453627586364746, 0.721... [0.07508277893066406, 0.07508540153503418, 0.0... [0.8235930735930735, 0.8295454545454546, 0.818... [0.796923076923077, 0.7976011994002998, 0.8006... [0.7275280898876404, 0.7471910112359551, 0.705... [0.7606461086637298, 0.7715736040609138, 0.749...
ExtraTreesClassifier [1.8575339317321777, 1.888782024383545, 1.8731... [0.12499260902404785, 0.12499213218688965, 0.1... [0.808982683982684, 0.8008658008658008, 0.7916... [0.760522496371553, 0.7478386167146974, 0.7599... [0.7359550561797753, 0.7289325842696629, 0.671... [0.7480371163454677, 0.7382645803698435, 0.712...
GradientBoostingClassifier [2.078220844268799, 2.150218963623047, 2.11822... [0.02400040626525879, 0.027779102325439453, 0.... [0.8365800865800865, 0.8344155844155844, 0.825... [0.8005865102639296, 0.8038922155688623, 0.791... [0.7668539325842697, 0.7542134831460674, 0.742... [0.7833572453371592, 0.7782608695652172, 0.766...
BaggingClassifier [0.486358642578125, 0.486358642578125, 0.47073... [0.015625953674316406, 0.039700984954833984, 0... [0.8143939393939394, 0.808982683982684, 0.7927... [0.786046511627907, 0.7765793528505393, 0.7683... [0.7120786516853933, 0.7078651685393258, 0.661... [0.7472365512159176, 0.7406318883174137, 0.710...
KNeighborsClassifier [0.01562809944152832, 0.01562809944152832, 0.0... [0.9189648628234863, 0.9033389091491699, 0.918... [0.8041125541125541, 0.8122294372294372, 0.803... [0.7708978328173375, 0.77526395173454, 0.76687... [0.699438202247191, 0.7219101123595506, 0.7022... [0.7334315169366716, 0.7476363636363637, 0.733...
LogisticRegression [0.12395191192626953, 0.14353656768798828, 0.1... [0.01958465576171875, 0.04000663757324219, 0.0... [0.8079004329004329, 0.8181818181818182, 0.811... [0.7856, 0.7955974842767296, 0.793214862681744... [0.6896067415730337, 0.7106741573033708, 0.689... [0.7344801795063575, 0.7507418397626112, 0.737...
GaussianNB [0.02400040626525879, 0.02400040626525879, 0.0... [0.02399587631225586, 0.02399587631225586, 0.0... [0.7738095238095238, 0.7797619047619048, 0.764... [0.7963709677419355, 0.8105906313645621, 0.812... [0.5547752808988764, 0.5589887640449438, 0.505... [0.6539735099337749, 0.6616791354945969, 0.623...
SVC [5.1321375370025635, 5.216028690338135, 4.9741... [2.7087018489837646, 2.7156307697296143, 2.633... [0.8327922077922078, 0.8295454545454546, 0.824... [0.7933042212518195, 0.794074074074074, 0.7833... [0.7654494382022472, 0.7528089887640449, 0.751... [0.7791279485346676, 0.7728911319394376, 0.767...
XGBClassifier [2.78363299369812, 2.78363299369812, 2.7516334... [0.03450345993041992, 0.03450345993041992, 0.0... [0.8262987012987013, 0.8208874458874459, 0.808... [0.7879234167893961, 0.7822222222222223, 0.775... [0.7514044943820225, 0.7415730337078652, 0.707... [0.7692307692307693, 0.7613554434030282, 0.740...
MLPClassifier [20.106656074523926, 20.14053773880005, 19.947... [0.023995399475097656, 0.0240020751953125, 0.0... [0.8327922077922078, 0.8295454545454546, 0.819... [0.7899280575539568, 0.7815602836879433, 0.773... [0.7710674157303371, 0.773876404494382, 0.7528... [0.7803837953091683, 0.7776993648553283, 0.762...

I don't want the individual score of every fold; I just want the average. How do I calculate the average of each metric?


Solution

  • You can simply do the following to average your scores:

    cv_result = [
        dtc_score, rfc_score, abc_score, etc_score, gbc_score, bgc_score, 
        knn_score, logreg_score, nb_score, svm_score, xgb_score, mlp_score]
    
    # each score dict maps metric name -> array of per-fold values, so .mean() averages over the folds
    results = [{k: v.mean() for k, v in result.items()} for result in cv_result]
    df_cv_result = pd.DataFrame(results, index=model_name)
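
    Since cross_validate returns a plain dict of per-fold NumPy arrays (fit_time, score_time, test_accuracy, and so on), you could also stay entirely in pandas; a rough sketch of the same averaging, assuming the cv_result and model_name from above:

    df_cv_result = pd.DataFrame(
        {name: pd.DataFrame(result).mean()   # per-fold DataFrame -> Series of fold-averaged metrics
         for name, result in zip(model_name, cv_result)}
    ).T

    This gives one row per model and one column per metric, just like the version above.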
    

    Note: you might also want to calculate the standard deviation, as it can help you choose the best model.

    EDIT:

    It might not be the most elegant solution, but you could do the following to add both the mean and the standard deviation:

    results = []
    for result in cv_result:
        tmp = dict()
        for k, v in result.items():
            # store both statistics under suffixed keys, e.g. test_f1_mean and test_f1_std
            tmp.update({f'{k}_mean': v.mean(), f'{k}_std': v.std()})
        results.append(tmp)
    df_cv_result = pd.DataFrame(results, index=model_name)
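
    If you would rather keep each metric's mean and standard deviation side by side, a slightly more pandas-flavoured sketch (again assuming the cv_result and model_name from above) is to aggregate each per-fold DataFrame and stack the results:

    summary = pd.concat(
        {name: pd.DataFrame(result).agg(['mean', 'std'])   # rows 'mean'/'std', one column per metric
         for name, result in zip(model_name, cv_result)},
        names=['model', 'stat'],
    )

    The rows then form a (model, stat) MultiIndex, and summary.xs('mean', level='stat') recovers the table of fold-averaged scores.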