python machine-learning scikit-learn classification supervised-learning

Why does the classifier's accuracy differ on each run?


Each time I run this code, the accuracy comes out different. Can anyone explain why? Am I missing something here? Thanks in advance :)

Below is my code:

import scipy
import numpy
from sklearn import datasets
from sklearn.model_selection import train_test_split
iris = datasets.load_iris()

X = iris.data
y = iris.target

X_train, X_test, y_train,y_test = train_test_split(X,y, test_size = .5)

# Use a k-nearest neighbors classifier
from sklearn.neighbors import KNeighborsClassifier
my_classifier = KNeighborsClassifier()

my_classifier.fit(X_train,y_train)
predictions = my_classifier.predict(X_test)
print(predictions)

from sklearn.metrics import accuracy_score
print(accuracy_score(y_test,predictions))

Solution

  • train_test_split shuffles the data randomly before splitting it into training and test sets, so you get a different split, and therefore a different accuracy, each time you run the script. To get the same split on every run, set the random_state parameter to a fixed integer:

    X_train, X_test, y_train,y_test = train_test_split(X,y, test_size = .5, random_state = 0)
    

    This should give you an accuracy of 0.96 every time.
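
To see the effect for yourself, here is a minimal sketch that wraps the question's code in a helper and scores it several times. The helper name knn_accuracy and the variables fixed_scores and varied_scores are made up for this illustration and are not part of scikit-learn; the seeds 0 to 4 are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def knn_accuracy(random_state=None):
    """Split iris 50/50, fit a default KNN, and return the test accuracy.

    Hypothetical helper for illustration only.
    """
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=random_state
    )
    clf = KNeighborsClassifier()
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# Same seed on every call: the split, and so the score, never changes.
fixed_scores = [knn_accuracy(random_state=0) for _ in range(3)]
print(fixed_scores)

# A different seed per call: each split is different, so the scores can differ.
varied_scores = [knn_accuracy(random_state=seed) for seed in range(5)]
print(varied_scores)
```

The first list should contain one value repeated three times, while the second will typically contain several different values. That spread comes entirely from which samples land in the test set, since the classifier itself has no randomness here.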