python machine-learning scikit-learn knn

Calling the fit method more than once on sklearn's KNeighborsClassifier


from sklearn.neighbors import KNeighborsClassifier

knn_clf = KNeighborsClassifier()
knn_clf.fit(x_train[:92000], y_train[:92000])  # 1st method call
knn_clf.fit(x_train[92000:123000], y_train[92000:123000])  # 2nd method call

My question: when I call the fit method like this, does the 2nd call train the model again from scratch, or does it add to what the model learned from the previous fit call (the 1st method call)?

What I am trying to achieve is batch-wise training, because my laptop can't handle the complete dataset at once. Thanks in advance :-)


Solution

  • Every call to the fit method fits the model from scratch. If you call fit multiple times, each call discards what the previous call stored and refits on the new data only. As @Julien pointed out, batch training doesn't make sense for KNN.

    KNN has no real training phase: fit only stores (and indexes) the training points, and at prediction time the model considers all of them and picks the K nearest neighbors. So the larger your data, the more memory and time it takes.

    Your options are to downsample your data or to increase your system's memory.
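To make the refit behaviour concrete, here is a minimal sketch on made-up toy data (the arrays `X_a`, `X_b` and the subsample size are assumptions for illustration, not taken from the question). It fits on one batch, then on a second, and checks that only the second batch is remembered; it then shows one possible way to downsample before fitting.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy batches: batch A is all class 0, batch B is all class 1.
X_a = np.array([[0.0], [0.1], [0.2]])
y_a = np.array([0, 0, 0])
X_b = np.array([[10.0], [10.1], [10.2]])
y_b = np.array([1, 1, 1])

knn_clf = KNeighborsClassifier(n_neighbors=1)

knn_clf.fit(X_a, y_a)                      # 1st call: model stores batch A
classes_after_first = knn_clf.classes_.tolist()

knn_clf.fit(X_b, y_b)                      # 2nd call: batch A is discarded
classes_after_second = knn_clf.classes_.tolist()

# The query point sits exactly on a batch-A sample, but the model only
# knows batch B now, so it can only answer with class 1.
pred = knn_clf.predict([[0.0]])

# Downsampling sketch: fit on a random subset drawn without replacement.
X_all = np.vstack([X_a, X_b])
y_all = np.concatenate([y_a, y_b])
rng = np.random.default_rng(0)
idx = rng.choice(len(X_all), size=4, replace=False)  # size is arbitrary here
knn_small = KNeighborsClassifier(n_neighbors=1).fit(X_all[idx], y_all[idx])
```

After the second call, `classes_` contains only the label from batch B, which is a quick way to confirm that nothing from the first call survived. For real data you would pick the subsample size based on what fits in memory.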