python machine-learning scikit-learn logistic-regression

How to combine two logistic regression models using Python and scikit-learn?


I'm new to Python and scikit-learn. I have two logistic regression models created with scikit-learn, and I want to combine them into a new model. What I have in mind is something like this:

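from sklearn.linear_model import LogisticRegression

clf1 = LogisticRegression()
clf1.fit(X_set, Y_set)
clf2 = LogisticRegression()
clf2.fit(X_set, Y_set)
# Pseudocode for what I want; estimators do not support "+"
combined_clf = clf1 + clf2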

But I don't know how to do that. Thanks in advance to all.


Solution

  • Two methods might suit your needs here.

    The first one is to make each of your classifiers vote for the predicted class. To do so, you can use sklearn.ensemble.VotingClassifier. With your example:

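    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression

    clf1 = LogisticRegression()
    clf2 = LogisticRegression()
    # Hard voting: each classifier casts one vote for a class label
    eclf1 = VotingClassifier(estimators=[('lr1', clf1), ('lr2', clf2)], voting='hard')
    eclf1 = eclf1.fit(X, Y)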
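
If averaging the two models is closer to what `clf1 + clf2` was meant to express, soft voting may be a better fit than hard voting: it averages the predicted class probabilities. The sketch below is only an illustration. The synthetic data from `make_classification` and the two `C` values are assumptions (chosen so the two models actually differ), not part of the original question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data, used here only as a stand-in for X and Y.
X, Y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# Different regularization strengths so the two models are not identical.
clf1 = LogisticRegression(C=0.01, max_iter=1000)
clf2 = LogisticRegression(C=100.0, max_iter=1000)

# voting='soft' averages predict_proba across the fitted estimators.
eclf = VotingClassifier(estimators=[('lr1', clf1), ('lr2', clf2)], voting='soft')
eclf.fit(X_train, Y_train)

# Compare the ensemble output with a manual average of the members.
proba = eclf.predict_proba(X_test)
manual = np.mean([est.predict_proba(X_test) for est in eclf.estimators_], axis=0)
print(np.allclose(proba, manual))
print(eclf.score(X_test, Y_test))
```

Note that `VotingClassifier` clones and refits the estimators you pass in, so the fitted members live in `eclf.estimators_` rather than in `clf1` and `clf2`.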
    

    The other one is stacking. The idea is to train a meta-classifier on the outputs (predicted labels or probabilities) of your first-level classifiers, so that it learns how to combine them.

    Here is a useful link describing the method: https://rasbt.github.io/mlxtend/user_guide/classifier/StackingClassifier/

    Using mlxtend and your example:

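    from mlxtend.classifier import StackingClassifier
    from sklearn.linear_model import LogisticRegression

    clf1 = LogisticRegression()
    clf2 = LogisticRegression()
    # Replace with the meta-classifier of your choice; LogisticRegression is only a stand-in
    lr = LogisticRegression()
    sclf = StackingClassifier(classifiers=[clf1, clf2],
                              meta_classifier=lr)
    sclf.fit(X, Y)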
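
If you would rather not add a dependency, recent scikit-learn releases (0.22 and later, as far as I know) include their own `sklearn.ensemble.StackingClassifier`. Here is a minimal sketch of the same idea with that class. The synthetic data, the `C` values, and the choice of `LogisticRegression` as the final estimator are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data, used here only as a stand-in for X and Y.
X, Y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# First-level classifiers; different C values so they are not identical.
clf1 = LogisticRegression(C=0.01, max_iter=1000)
clf2 = LogisticRegression(C=100.0, max_iter=1000)

# The final estimator is trained on cross-validated predictions
# of the first-level classifiers (cv=5 folds here).
sclf = StackingClassifier(
    estimators=[('lr1', clf1), ('lr2', clf2)],
    final_estimator=LogisticRegression(),
    cv=5,
)
sclf.fit(X_train, Y_train)

score = sclf.score(X_test, Y_test)
print(score)
```

Unlike the mlxtend version, this class takes named `(name, estimator)` pairs, in the same style as `VotingClassifier`.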
    

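    However, both models in your example are trained on the same data with the same deterministic method, so they will be essentially identical. I don't think voting between them or stacking them would lead to any improvement. Combining models tends to help only when they differ (different algorithms, hyperparameters, features, or training subsets).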
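
A quick way to check this yourself is to fit the same model twice and compare the results. The sketch below assumes synthetic data from `make_classification` as a stand-in for `X_set` and `Y_set`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic data, a stand-in for X_set and Y_set.
X_set, Y_set = make_classification(n_samples=300, n_features=8, random_state=0)

# Same class, same default settings, same training data.
clf1 = LogisticRegression().fit(X_set, Y_set)
clf2 = LogisticRegression().fit(X_set, Y_set)
same_coef = np.allclose(clf1.coef_, clf2.coef_)

# Voting between two identical models reproduces either one of them.
eclf = VotingClassifier(
    estimators=[('lr1', LogisticRegression()), ('lr2', LogisticRegression())],
    voting='hard',
)
eclf.fit(X_set, Y_set)
same_pred = np.array_equal(eclf.predict(X_set), clf1.predict(X_set))

print(same_coef, same_pred)
```

If both checks come back true, the ensemble adds nothing over a single model, and you need to vary something between the two classifiers before combining them.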

    Hope this helps!

    (Oh, possible duplicate of: Ensemble of different kinds of regressors using scikit-learn (or any other python framework)?)