machine-learning scikit-learn logistic-regression sklearn-pandas

sklearn SGDClassifier fit() vs partial_fit()


I am confused about the fit() and partial_fit() methods of SGDClassifier. The documentation gives the same description for both: "Fit linear model with Stochastic Gradient Descent."

What I know about stochastic gradient descent is that it uses one training example (or a fraction of the whole training set) to update the model parameters in each iteration, whereas ordinary gradient descent uses the whole data set in each iteration. I want to train a logistic regression model with both ordinary gradient descent and stochastic gradient descent and compare the time each one takes. How can I do that with SGDClassifier? Does the fit() method work as ordinary gradient descent?


Solution

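  • I think the partial_fit method is meant for incremental training: each call makes a single pass over the data it is given and continues from the model's current weights, so it is useful for updating a model that has already been trained or for feeding in data chunk by chunk. The fit method, by contrast, re-trains the model from scratch (unless warm_start=True is set), running up to max_iter passes over the data.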

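    As for manually selecting how much of the data is included in each weight update, I can't find an argument for this in the SGDClassifier documentation. Both fit and partial_fit update the weights one sample at a time, so neither of them behaves like ordinary (full-batch) gradient descent.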