Tags: pandas, regression, random-effects

Store regression coefficients, merge back into data-frame


I'm trying to estimate a random effects model and store its coefficients. I then want to merge them back into the data-frame to predict the dependent variable.

There is a random effect coefficient for each group. In the data-frame, if an observation belongs to group 1, I want the group 1 coefficient listed there. For observations in group 2, the group 2 coefficient and so on.
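To make the goal concrete, here is a minimal sketch of the mapping described above. The frame `obs`, the dictionary `group_coef`, and the numbers in it are all hypothetical stand-ins, not fitted values; the point is only that each row should pick up the coefficient that belongs to its group.

```python
import pandas as pd

# Hypothetical data: three observations spread over two groups.
obs = pd.DataFrame({"GroupID": [1, 1, 2], "x": [0.5, 1.5, 2.5]})

# Hypothetical per-group coefficients (made-up numbers, not model output).
group_coef = {1: 0.30, 2: -0.30}

# Look up each row's group in the dictionary and store the result as a new column.
obs["group_coef"] = obs["GroupID"].map(group_coef)
print(obs)
```

`Series.map` is enough when there is a single coefficient per group; a merge is more convenient once each group carries several columns.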

I am able to access and store the coefficients. But I'm not able to merge them back into the data-frame. I'm not sure how to think of it. Here is the code I have so far:

import statsmodels.formula.api as smf
md = smf.mixedlm('y ~ x', data=train, groups=train['GroupID'])
mdf = md.fit()

I tried storing the coefficients in three ways:

re_coeffs = pd.Series(mdf.random_effects.values) #creates a series with shape (1,)

re_coeffs = [(k) for k in mdf.random_effects.values()] #creates a list with the coeffs

re_coeffs = np.array(mdf.random_effects.values) #creates array with shape ()

All of them run without errors, but none of them gives me something I can merge back into the original data-frame. I'm not sure whether to use a dictionary or a list, or more generally how to think about merging these coefficients back into the original data-frame.
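The shapes reported in the comments above can be reproduced without fitting a model. The sketch below assumes `random_effects` behaves like a dictionary mapping each group label to a small pandas Series (the stand-in `effects` and its numbers are hypothetical). Writing `.values` without parentheses passes the method object itself, which is why the Series and the array end up holding a single object rather than the coefficients.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for mdf.random_effects: group label -> Series of effects.
effects = {1: pd.Series({"Group": 0.30}), 2: pd.Series({"Group": -0.30})}

# No parentheses: the bound method is wrapped, not the coefficients.
as_series = pd.Series(effects.values)
as_array = np.array(effects.values)
print(as_series.shape, as_array.shape)

# Calling the method returns the coefficients, but the group labels are dropped.
as_list = list(effects.values())

# Keeping the keys as well preserves the link between label and coefficient.
labels = list(effects.keys())
print(labels, len(as_list))
```

Whichever container is used, the group labels have to travel with the coefficients, otherwise there is no key to merge on.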

I'd appreciate any suggestions.


Solution

  • This seems to work:

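    import pandas as pd
    import statsmodels.formula.api as smf

    md = smf.mixedlm('y ~ x', data=train, groups=train['GroupID'])
    mdf = md.fit()

    # random_effects is a dict keyed by group label; transposing gives one row per group
    df = pd.DataFrame(mdf.random_effects).T

    # expose the group label as a column so it can serve as the merge key
    df['GroupID'] = df.index
    merged = pd.merge(train, df, on=['GroupID'])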
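For readers who want to try this end to end, here is a self-contained sketch on synthetic data. Everything about the data (six groups, the true slope, the noise levels) is made up for illustration, and the column name `re_intercept` is my own choice, not something statsmodels produces. It assumes a random-intercept model, so each group has exactly one random effect.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for `train`: 6 groups, 30 observations each (made-up data).
rng = np.random.default_rng(0)
n_groups, n_per_group = 6, 30
group_ids = np.repeat(np.arange(1, n_groups + 1), n_per_group)
group_shift = rng.normal(0.0, 2.0, size=n_groups)  # true random intercepts
x = rng.normal(size=group_ids.size)
noise = rng.normal(0.0, 0.5, size=group_ids.size)
y = 1.0 + 0.5 * x + group_shift[group_ids - 1] + noise
train = pd.DataFrame({"GroupID": group_ids, "x": x, "y": y})

mdf = smf.mixedlm("y ~ x", data=train, groups=train["GroupID"]).fit()

# One row per group: the group label plus its random intercept.
# Taking the first element avoids relying on how statsmodels names that entry.
re_df = pd.DataFrame({
    "GroupID": list(mdf.random_effects.keys()),
    "re_intercept": [np.asarray(v)[0] for v in mdf.random_effects.values()],
})

# A left merge keeps every training row and repeats each group's coefficient.
merged = train.merge(re_df, on="GroupID", how="left")

# Prediction = fixed-effects part + the group's random intercept.
fe = mdf.fe_params
merged["y_hat"] = fe["Intercept"] + fe["x"] * merged["x"] + merged["re_intercept"]
print(merged.head())
```

The `y_hat` column built this way should agree with `mdf.fittedvalues`, which is a convenient sanity check that the merge attached the right coefficient to each row.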