
How to combine already trained classifiers with StackingClassifier?


StackingClassifier in sklearn can stack several models. When its .fit method is called, the underlying models are trained.

A typical use case for StackingClassifier:

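from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

model1 = LogisticRegression()
model2 = RandomForestClassifier()

# estimators must be a list of (name, estimator) tuples
combination = StackingClassifier([("lr", model1), ("rf", model2)])

# fits both base models and the final estimator
combination.fit(X_train, y_train)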

However, what I need is the following:

model1 = LogisticRegression()
model1.fit(X_train_1, y_train_1)

model2 = RandomForestClassifier()
model2.fit(X_train_2, y_train_2)

combination = StackingClassifier([("lr", model1), ("rf", model2)], refit=False)

combination.fit(X_train_3, y_train_3)

Here refit is a parameter that does not exist; it is what I would need.

I have already trained model1 and model2 and do not want to refit them. I only need to fit the stacking model that combines the two. How do I elegantly combine them into one model that exposes an end-to-end .predict?

Of course, I can predict with the first and second models, put the predictions in a data frame, and fit a third model on it. I would like to avoid that, because then I cannot ship the model as a single end-to-end artifact.


Solution

  • You're close: it's cv="prefit", not refit=False. From the API docs:

    cv : int, cross-validation generator, iterable, or “prefit”, default=None

    [...]

    • "prefit" to assume the estimators are prefit. In this case, the estimators will not be refitted.
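
    A minimal sketch of how this could look for the setup in the question. The synthetic data from make_classification, the three-way split, the estimator names "lr" and "rf", and the choice of LogisticRegression as final_estimator are all assumptions made for illustration. As far as I know, cv="prefit" needs scikit-learn 1.1 or newer.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression

    # Synthetic stand-in for the three training sets in the question (assumption).
    X, y = make_classification(n_samples=600, n_features=10, random_state=0)
    X_train_1, y_train_1 = X[:200], y[:200]
    X_train_2, y_train_2 = X[200:400], y[200:400]
    X_train_3, y_train_3 = X[400:], y[400:]

    # Base models are trained separately, each on its own data.
    model1 = LogisticRegression(max_iter=1000).fit(X_train_1, y_train_1)
    model2 = RandomForestClassifier(n_estimators=50, random_state=0).fit(
        X_train_2, y_train_2
    )

    # cv="prefit" tells the stack that the base models are already fitted.
    combination = StackingClassifier(
        estimators=[("lr", model1), ("rf", model2)],
        final_estimator=LogisticRegression(),
        cv="prefit",
    )

    # Only the final estimator is fitted here, on the base models' predictions.
    combination.fit(X_train_3, y_train_3)

    # End-to-end prediction through both base models and the final estimator.
    predictions = combination.predict(X_train_3)
    ```

    The fitted combination is a single estimator, so it can be saved and passed around as one artifact. With "prefit", the final estimator is trained on plain (not cross-validated) predictions of the base models, so it is safer to fit the stack on data the base models have not seen, as in the question.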