Significance test using scikit-learn's permutation test gives the same p-value for all classifiers


I'm trying to assess the significance of my results using scikit-learn's permutation test, as in:

score, permutation_scores, pvalue = permutation_test_score(clf.best_estimator_, X_train, Y_train, cv=10, n_jobs=10, n_permutations=100, scoring='accuracy')

where clf.best_estimator_ is the estimator selected by cross-validation.

I run it for several classifiers (several independent clf.best_estimator_ objects), but the p-value is the same for all of them: 0.00990099009901.

I have no idea why this happens. The strange thing is that this is the same number reported in the example code linked from the scikit-learn user guide.


Solution

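  • I asked the same question on scikit-learn's issue tracker. The answer: the p-value is computed as (C + 1) / (n_permutations + 1), where C is the number of permutations whose score is at least as good as the score on the real labels. For most good classifiers, no permuted run matches the real score, so C = 0 and with n_permutations=100 the result is 1/101 = 0.00990099..., the smallest p-value that 100 permutations can produce.

    So there's nothing wrong with this magic number.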
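To see where the number comes from, here is a minimal sketch in plain Python. It assumes the empirical p-value formula given in the scikit-learn documentation, (C + 1) / (n_permutations + 1). The helper name `permutation_pvalue` and the scores below are made up for illustration; they are not part of scikit-learn or of the original question.

```python
def permutation_pvalue(score, permutation_scores):
    """Empirical p-value with the +1 correction: (C + 1) / (n_permutations + 1)."""
    n_permutations = len(permutation_scores)
    # C = number of permuted-label runs that scored at least as well as the real run
    c = sum(1 for s in permutation_scores if s >= score)
    return (c + 1) / (n_permutations + 1)


# Hypothetical data: 100 permuted-label runs that all score at chance level
permuted = [0.50] * 100

# Two different "good" classifiers: neither is matched by any permuted run,
# so both get C = 0 and therefore the same p-value.
p_a = permutation_pvalue(0.90, permuted)
p_b = permutation_pvalue(0.80, permuted)
print(p_a, p_b)  # both equal 1/101, about 0.0099

# With more permutations the smallest reachable p-value shrinks to 1/1001
p_fine = permutation_pvalue(0.90, [0.50] * 1000)
print(p_fine)
```

If the goal is to tell the classifiers apart, the p-value is probably the wrong tool once every classifier beats all permuted runs. Raising n_permutations only lowers the floor (1/1001 for 1000 permutations); comparing the cross-validated scores themselves is likely more informative.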