
Extract r^2 from multiple models in plyr


I am hoping to efficiently combine my regressions using plyr functions. I have data frames of monthly data, one per year, named in the format yDDDD (so y2014, y2013, etc.).

Right now, I have the below code for one of those dfs, y2014. I am running the regressions by month, as desired within each year.

modelsm2= by(y2014,y2014$Date,function(x) lm(y~.,data=x))
summarym2=lapply(modelsm2,summary)
coefficientsm2=lapply(modelsm2,coef)
coefsm2v2=ldply(modelsm2,coef) #to get the coefficients into an exportable df

I have several things I'd like to do and I would really appreciate your help!

A. Extract the r^2 for each model. I know that for one model, you can do summary(model)$r.squared to get it, but I have not had luck with my construct.

B. Apply the same methodology in a loop-type structure to get the models to run for all of my data frames (y2013 and backwards)

C. Get the summary into an easily exportable (to Excel) format --> the ldply function does not work for the summaries.

Thanks again.


Solution

  • A. You need to extract the r.squared element from each of your summaries:

    lapply(summarym2,"[[","r.squared")
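A minimal sketch of the same idea that returns a data frame rather than a list. The data frame `toy` and its columns `x1` and `x2` are hypothetical stand-ins for y2014, and `split()` with `lapply()` is used here in place of `by()` so that the result is a plain named list. Treat it as an illustration under those assumptions, not the original poster's exact setup.

```r
# Hypothetical stand-in for y2014: three months, 20 rows each
set.seed(1)
toy <- data.frame(
  Date = rep(c("2014-01", "2014-02", "2014-03"), each = 20),
  x1   = rnorm(60),
  x2   = rnorm(60)
)
toy$y <- 1 + 2 * toy$x1 - toy$x2 + rnorm(60)

# One model per month; predictors are named explicitly so that Date
# (constant within a month) is not used as a regressor
models <- lapply(split(toy, toy$Date), function(d) lm(y ~ x1 + x2, data = d))

# sapply() simplifies the per-model values to a named numeric vector
r2 <- sapply(models, function(m) summary(m)$r.squared)

# One row per month, ready for export
r2_df <- data.frame(Date = names(r2), r.squared = unname(r2))
print(r2_df)
```

Using `sapply` instead of `lapply` is what turns the result into a vector, which is usually easier to bind into a data frame.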
    

    B. Put all your data frames into a list and wrap another lapply around the by call, storing the result, e.g.:

    lmlist <- lapply(list(y2014,y2013,y2012), function(dat)
                                       by(dat,dat$Date, function(x) lm(y~.,data=x))
          )
    

    The result, lmlist here, is a list of lists, so to extract the summaries, for example, you would use:

    lapply(lmlist,lapply,summary)
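The nested structure can also be flattened into a single table of r^2 values across years. The sketch below assumes a hypothetical helper `make_year()` that fabricates toy data in place of the real y2014 and y2013, a single predictor `x1`, and `split()` in place of `by()`. Naming the list elements is a design choice that keeps the year label attached to every model.

```r
# Hypothetical helper: builds a toy year with three months of data
make_year <- function(year, seed) {
  set.seed(seed)
  d <- data.frame(
    Date = rep(sprintf("%d-%02d", year, 1:3), each = 20),
    x1   = rnorm(60)
  )
  d$y <- 1 + 2 * d$x1 + rnorm(60)
  d
}

# Naming the list elements carries the year through every later step
years <- list(y2014 = make_year(2014, 1), y2013 = make_year(2013, 2))

# Outer lapply: one element per year; inner lapply: one model per month
lmlist <- lapply(years, function(dat) {
  lapply(split(dat, dat$Date), function(d) lm(y ~ x1, data = d))
})

# Flatten the nested list into one data frame of r^2 values
r2_all <- do.call(rbind, lapply(names(lmlist), function(yr) {
  r2 <- sapply(lmlist[[yr]], function(m) summary(m)$r.squared)
  data.frame(year = yr, Date = names(r2), r.squared = unname(r2),
             row.names = NULL)
}))
print(r2_all)
```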
    

    C. summary returns a fairly complex data structure that cannot be coerced into a data.frame. The output you see on screen is produced by its print method. You can use capture.output to get a character vector holding each line of that output, which you can then write to a file.
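Although the summary object as a whole does not coerce, its coefficient table is an ordinary numeric matrix, so one possible route to an Excel-friendly file is to stack those tables and write a CSV. This is a sketch under assumptions: `toy`, `x1`, `coef_table`, and the temporary output paths are hypothetical names chosen for illustration.

```r
# Hypothetical toy data: two months, 20 rows each
set.seed(1)
toy <- data.frame(
  Date = rep(c("2014-01", "2014-02"), each = 20),
  x1   = rnorm(40)
)
toy$y <- 1 + 2 * toy$x1 + rnorm(40)
models <- lapply(split(toy, toy$Date), function(d) lm(y ~ x1, data = d))

# coef(summary(m)) is a plain numeric matrix, which converts cleanly
coef_table <- do.call(rbind, lapply(names(models), function(nm) {
  s   <- summary(models[[nm]])
  tab <- as.data.frame(coef(s))
  data.frame(Date = nm, term = rownames(tab), tab,
             r.squared = s$r.squared, row.names = NULL)
}))

# CSV opens directly in Excel; tempfile() is used here only for illustration
out_csv <- tempfile(fileext = ".csv")
write.csv(coef_table, out_csv, row.names = FALSE)

# The printed summary can still be kept as plain text via capture.output()
out_txt <- tempfile(fileext = ".txt")
writeLines(capture.output(print(summary(models[[1]]))), out_txt)
```

The CSV holds one row per model term, with the model's r^2 repeated on each of its rows, while the text file preserves the familiar printed layout for reference.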