Tags: r, regression, panel, stargazer, plm

How to get between and overall R2 from plm FE regression with stargazer?


Disclaimer: This question is closely related to one I asked two days ago, but it now concerns getting the between and overall R2 into stargazer() output rather than summary() output as before.

Is there a way to get plm() to calculate between R2 and overall R2 for me and include them in the stargazer() output?

To clarify what I mean by between, overall, and within R2, see this answer on StackExchange.

My understanding is that plm only calculates the within R2. I am running a two-way effects within model.

library(plm)
library(stargazer)

# Create some random data
set.seed(1) 
x=rnorm(100); fe=rep(rnorm(10),each=10); id=rep(1:10,each=10); ti=rep(1:10,10); e=rnorm(100)
y=x+fe+e

data=data.frame(y,x,id,ti)

# Get plm within R2
reg=plm(y~x,model="within",index=c("id","ti"), effect = "twoways", data=data) 
stargazer(reg)

I now also want to include between and overall R2 in the stargazer() output. How can I do that?

To make it explicit what I mean with between and overall R2:

# Pooled Version (overall R2)
reg1=lm(y~x)
summary(reg1)$r.squared

# Between R2
y.means=tapply(y,id,mean)[id]
x.means=tapply(x,id,mean)[id]

reg2=lm(y.means~x.means)
summary(reg2)$r.squared
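
For reference, the three quantities can be cross-checked by hand in base R. This is only a sketch under stated assumptions: it relies on there being a single regressor (so each R2 is a squared correlation) and on the panel being balanced (so the two-way within transformation reduces to simple demeaning). The names `demean2`, `overall_r2`, `between_r2` and `within_r2` are illustrative and not part of plm.

```r
# Same simulated data as above (balanced panel: 10 ids x 10 periods)
set.seed(1)
x <- rnorm(100); fe <- rep(rnorm(10), each = 10)
id <- rep(1:10, each = 10); ti <- rep(1:10, 10); e <- rnorm(100)
y <- x + fe + e

# Overall R2: with one regressor and an intercept, R2 equals cor(y, x)^2
overall_r2 <- cor(y, x)^2

# Between R2: squared correlation of the individual (id) means
between_r2 <- cor(as.numeric(tapply(y, id, mean)),
                  as.numeric(tapply(x, id, mean)))^2

# Two-way within transformation (assumes a balanced panel):
# subtract the id mean and the time mean, then add back the grand mean
demean2 <- function(v) v - ave(v, id) - ave(v, ti) + mean(v)
within_r2 <- cor(demean2(y), demean2(x))^2

c(overall = overall_r2, between = between_r2, within = within_r2)
```

With more than one regressor, or with an unbalanced panel, these shortcuts no longer apply and the regressions would have to be run explicitly.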

Solution

  • To do this in stargazer, you can use the add.lines argument. However, it places the extra lines at the top of the summary statistics section, and there is no way to change that without modifying stargazer's source code, which is unwieldy. I much prefer huxtable, which provides a grammar of table building and is much more extensible and customizable.

    library(tidyverse)
    library(plm)
    library(stargazer)  # needed for the stargazer() call below
    library(huxtable)
    
    # Create some random data
    set.seed(1) 
    x=rnorm(100); fe=rep(rnorm(10),each=10); id=rep(1:10,each=10); ti=rep(1:10,10); e=rnorm(100)
    y=x+fe+e
    
    data=data.frame(y,x,id,ti)
    
    # Get plm within R2
    reg=plm(y~x,model="within",index=c("id","ti"), effect = "twoways", data=data) 
    stargazer(reg, type = "text", 
              add.lines = list(c("Overall R2", round(r.squared(reg, model = "pooled"), 3)),
                               c("Between R2", round(r.squared(update(reg, effect = "individual", model = "between")), 3))))
    #> 
    #> ========================================
    #>                  Dependent variable:    
    #>              ---------------------------
    #>                           y             
    #> ----------------------------------------
    #> x                     1.128***          
    #>                        (0.113)          
    #>                                         
    #> ----------------------------------------
    #> Overall R2              0.337           
    #> Between R2              0.174           
    #> Observations             100            
    #> R2                      0.554           
    #> Adjusted R2             0.448           
    #> F Statistic    99.483*** (df = 1; 80)   
    #> ========================================
    #> Note:        *p<0.1; **p<0.05; ***p<0.01
    
    
    # I prefer huxreg, which is much more customizable!
    
    # Create a data frame of the R2 values
    r2s <- tibble(
      name = c("Overall R2", "Between R2"), 
      value = c(r.squared(reg, model = "pooled"), 
                r.squared(update(reg, effect = "individual", model = "between")))) 
    
    tab <- huxreg(reg) %>% 
      # Add new R2 values
      add_rows(hux(r2s), after = 4)
    
    # Relabel the default R2 row (row 7 after the insert) as the within R2
    tab[7, 1] <- "Within R2"
    
    tab %>% huxtable::print_screen()
    #> ─────────────────────────────────────────────────
    #>                                    (1)           
    #>                         ─────────────────────────
    #>   x                                   1.128 ***  
    #>                                      (0.113)     
    #>                         ─────────────────────────
    #>   N                                 100          
    #>   Overall R2                          0.337      
    #>   Between R2                          0.174      
    #>   Within R2                           0.554      
    #> ─────────────────────────────────────────────────
    #>   *** p < 0.001; ** p < 0.01; * p < 0.05.        
    #> 
    #> Column names: names, model1
    

    Created on 2020-04-08 by the reprex package (v0.3.0)
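
If the same extra rows are needed in several tables, the add.lines list can be built by a small helper. The sketch below is one possible way to do it for a single-model table; `r2_lines` is a hypothetical name, not a stargazer or plm function. It uses formatC() rather than round() so that trailing zeros are kept (0.17 is printed as 0.170).

```r
# Hypothetical helper: turn a named numeric vector into the list of
# character vectors (label, value) that stargazer's add.lines expects
# for a table with one model column.
r2_lines <- function(r2, digits = 3) {
  lapply(seq_along(r2), function(i) {
    c(names(r2)[i], formatC(r2[[i]], format = "f", digits = digits))
  })
}

# Example with made-up values
extra <- r2_lines(c("Overall R2" = 0.3374, "Between R2" = 0.17))
str(extra)

# Intended use with the model from above (requires plm and stargazer):
# stargazer(reg, type = "text",
#           add.lines = r2_lines(c(
#             "Overall R2" = r.squared(reg, model = "pooled"),
#             "Between R2" = r.squared(update(reg, effect = "individual",
#                                             model = "between")))))
```

For a table with several models, each vector would need one value per model column after the label, so the helper would have to be extended accordingly.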