python pandas machine-learning scikit-learn statsmodels

for loop to print logistic regression stats summary | statsmodels


I'm trying to write a for loop in statsmodels that iterates through a list of independent variables and prints the statistics summary of a logistic regression for each one. Fitting one model at a time works fine, but a loop would make it much easier to check each variable for significance.

Here is what I'm trying to do:

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv('source/data_cleaning/cleaned_data.csv')

def opportunites():
    dep = ['LEAVER']
    indep = ['AGE', 'S0287', 'T0080', 'SALARY', 'T0329', 'T0333', 'T0159', 'T0165', 'EXPER', 'T0356']
    for i in indep:
        model = smf.logit(dep, i, data = df ).fit()
        print(model.summary(yname="Status Leaver", xname=['Intercept', i ],  
        title='Single Logistic Regression'))
        print()
opportunites()

Here is the traditional method, which works:

def regressMulti2():
    model = smf.logit('LEAVER ~ AGE ', data = df).fit()
    print(model.summary(yname="Status Leaver",
    xname=['Intercept', 'AGE Less than 40 (AGE)'], title='Logistic Regression of Leaver and Age'))
    print()

regressMulti2()


Solution

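smf.logit expects a formula string such as 'LEAVER ~ AGE' as its first argument, not separate dependent and independent names. Build the formula inside the loop with an f-string:

def opportunites():
    indep = ['AGE', 'S0287', 'T0080', 'SALARY', 'T0329', 'T0333', 'T0159', 'T0165', 'EXPER', 'T0356']
    for i in indep:
        # One single-predictor model per column: 'LEAVER ~ AGE', 'LEAVER ~ S0287', ...
        model = smf.logit(f'LEAVER ~ {i}', data=df).fit()
        print(model.summary(
            yname="Status Leaver",
            xname=['Intercept', i],
            title=f'Logistic Regression of Leaver and {i}'
        ))
        print()

opportunites()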
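
If the goal is mainly to compare significance across variables, it may be easier to collect each coefficient and p-value into one table instead of reading ten printed summaries. The following is only a minimal sketch under stated assumptions: the `demo` DataFrame is synthetic data standing in for `cleaned_data.csv`, the helper name `single_logit_table` is made up for illustration, and every predictor is assumed to be numeric so that each model has exactly one slope term named after its column.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the real data (assumption: binary LEAVER, numeric predictors).
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "AGE": rng.integers(0, 2, n),       # 0/1 indicator, as in "AGE Less than 40"
    "SALARY": rng.normal(50, 10, n),
    "EXPER": rng.normal(10, 3, n),
})
# Make LEAVER depend on AGE only, so one predictor carries a real effect.
prob = 1 / (1 + np.exp(-(-0.5 + 1.0 * demo["AGE"].to_numpy())))
demo["LEAVER"] = rng.binomial(1, prob)


def single_logit_table(data, dep, indep):
    """Fit one single-predictor logit per column and tabulate the slope terms."""
    rows = []
    for col in indep:
        # disp=0 silences the optimizer's convergence printout.
        res = smf.logit(f"{dep} ~ {col}", data=data).fit(disp=0)
        rows.append({
            "variable": col,
            "coef": res.params[col],
            "p_value": res.pvalues[col],
            "n_obs": int(res.nobs),
        })
    # Smallest p-values first.
    return pd.DataFrame(rows).sort_values("p_value").reset_index(drop=True)


table = single_logit_table(demo, "LEAVER", ["AGE", "SALARY", "EXPER"])
print(table)
```

With the real data, the call would presumably be `single_logit_table(df, 'LEAVER', indep)`. Keep in mind that screening ten predictors one at a time involves multiple comparisons, so the p-values are best treated as a rough ranking.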