r, dplyr, lm, summary

How can I print the results of a summary and predict function by running a single code chunk using dplyr?


I am trying to fit several linear models using the tidyverse in R. For each model I want to print the results of the fit using summary, print the output of a custom function that returns statistical parameters summary does not report (such as AIC values), and then apply the model to predict values for a set of known data (a test dataset). Here is an example of what I am doing using the mtcars dataset.

library(tidyverse)
library(magrittr)
mtcars %>%
  filter(gear == "4") %$%
  lm(hp ~ mpg) %>%
  summary()
mtcars %>%
  filter(gear == "4") %$%
  lm(hp ~ mpg) %>%
  AIC()
mtcars %>%
  filter(gear == "4") %$%
  lm(hp ~ mpg) %>%
  predict(newdata = data.frame(mpg = 19))

I am often doing a lot of filtering of my data before calling lm (due to missing data that are not missing for all models, using mutate calls, using summarise, or filtering based on a categorical variable of interest), and fitting many different model permutations. However, I end up having to call the same code multiple times in order to obtain the summary statistics.

Normally I would just save the lm models as objects, but in this case I only want to run a preliminary test to see what the results look like and decide whether that version is worth saving, and I don't want large numbers of lm objects cluttering up my global environment. However, it seems that once a pipe is called after lm, it is not possible to refer to the temporary lm object again.

Is there any tidy way to retain a fitted lm object and fork it in the same string of code such that I can print the results of a summary, predict, and AIC function in a single call?


Solution

  • A magrittr pipeline allows a code block in braces, inside which . is the value coming from the chain. So

    mtcars %>%
      filter(gear == "4") %$%
      lm(hp ~ mpg) %>% {
        list(
          summary(.),
          AIC(.),
          predict(., newdata = data.frame(mpg = 19))
        )
      }
    

    will work. You could also use the %T>% (tee) pipe, but you'll need to explicitly print the values in the chain if you want to see them, because the tee pipe discards the result of its right-hand side and passes the original object on:

    mtcars%>%
      filter(gear=="4")%$%
      lm(hp~mpg) %T>%
      {print(summary(.))} %T>%
      {print(AIC(.))} %>%
      predict(newdata=data.frame(mpg=19))
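
If you run this kind of check often, the braces block could be moved into a small helper so the end of each pipeline stays short. The following is only a sketch: report_lm is a hypothetical name, not a function from any package, and it assumes the fitted model arrives as the first argument and that newdata contains the predictor columns the model needs.

```r
library(dplyr)
library(magrittr)

# Hypothetical helper (not part of any package): takes a model that has
# already been fitted and returns the three results as one named list,
# so the model is fitted only once.
report_lm <- function(fit, newdata) {
  list(
    summary    = summary(fit),
    aic        = AIC(fit),
    prediction = predict(fit, newdata = newdata)
  )
}

# gear is numeric in mtcars, so it is compared to the number 4 here
res <- mtcars %>%
  filter(gear == 4) %$%
  lm(hp ~ mpg) %>%
  report_lm(newdata = data.frame(mpg = 19))

res$summary     # coefficient table, R-squared, etc.
res$aic         # AIC of the fitted model
res$prediction  # predicted hp at mpg = 19
```

The assignment to res is only there to show how the named elements can be pulled out. If you drop it, the list is printed straight to the console and no lm object is left in the global environment.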