Tags: r, stata, panel-data, plm

Calculating within, between or overall R-square in R


I'm migrating from Stata to R (plm package) in order to do panel model econometrics. In Stata, panel models such as random effects usually report the within, between and overall R-squared.

I have found that the R-squared reported by plm for random effects models corresponds to the within R-squared. Is there any way to get the overall and between R-squared using the plm package in R?

Here is the same example in R and in Stata. In R:

library(plm)
library(foreign) # read Stata files
download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/wagepan.dta','wagepan.dta',mode="wb")
wagepan <- read.dta('wagepan.dta')

# Random effects
plm.re <- plm(lwage ~ educ + black + hisp + exper + expersq + married + union + d81 + d82 + d83 + d84 + d85 + d86 + d87,
              data=wagepan,
              model='random',
              index=c('nr','year'))
summary(plm.re)

In Stata:

use http://fmwww.bc.edu/ec-p/data/wooldridge/wagepan.dta
xtset nr year
xtreg lwage educ  black  hisp  exper  expersq  married  union  d81  d82  d83  d84  d85  d86  d87, re

The R-squared reported in R (0.18062) is, at least in this case, close to the within R-squared reported in Stata (0.1799). Is there any way to get, in R, the between (0.1860) and overall (0.1830) R-squared that Stata reports?


Solution

  • This website has the complete code to reproduce Example 14.4 in Wooldridge (2013, pp. 494-5), with R-squared reported for all models:

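    # install.packages(c("wooldridge"), dependencies = TRUE)
    # devtools::install_github("JustinMShea/wooldridge")
    library(wooldridge)
    data(wagepan)

    # install.packages(c("plm", "stargazer", "lmtest"), dependencies = TRUE)
    library(plm); library(lmtest); library(stargazer)

    # wagepan.p and yr were not defined in the snippet as posted; assumed here:
    # a pdata.frame indexed by person (nr) and year, with yr as year dummies
    wagepan.p <- pdata.frame(wagepan, index = c("nr", "year"))
    wagepan.p$yr <- factor(wagepan.p$year)

    # Common specification for pooled OLS and random effects
    model <- lwage ~ educ + black + hisp + exper + I(exper^2) + married + union + yr
    reg.ols <- plm(model, data = wagepan.p, model = "pooling")
    reg.re  <- plm(model, data = wagepan.p, model = "random")

    # Fixed effects: time-invariant educ, black and hisp are left out
    # (exper is also omitted, as in the original snippet)
    reg.fe <- plm(lwage ~ I(exper^2) + married + union + yr,
                  data = wagepan.p, model = "within")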
    
    # Pretty table of selected results (not reporting year dummies)
    stargazer(reg.ols,reg.re,reg.fe, type="text",
         column.labels=c("OLS","RE","FE"),
         keep.stat=c("n","rsq"),
         keep=c("ed","bl","hi","exp","mar","un"))
    

    which outputs,

    #> ==========================================
    #>                   Dependent variable:     
    #>              -----------------------------
    #>                          lwage            
    #>                 OLS       RE        FE    
    #>                 (1)       (2)       (3)   
    #> ------------------------------------------
    #> educ         0.091***  0.092***           
    #>               (0.005)   (0.011)           
    #>                                           
    #> black        -0.139*** -0.139***          
    #>               (0.024)   (0.048)           
    #>                                           
    #> hisp           0.016     0.022            
    #>               (0.021)   (0.043)           
    #>                                           
    #> exper        0.067***  0.106***           
    #>               (0.014)   (0.015)           
    #>                                           
    #> I(exper2)    -0.002*** -0.005*** -0.005***
    #>               (0.001)   (0.001)   (0.001) 
    #>                                           
    #> married      0.108***  0.064***   0.047** 
    #>               (0.016)   (0.017)   (0.018) 
    #>                                           
    #> union        0.182***  0.106***  0.080*** 
    #>               (0.017)   (0.018)   (0.019) 
    #>                                           
    #> ------------------------------------------
    #> Observations   4,360     4,360     4,360  
    #> R2             0.189     0.181     0.181  
    #> ==========================================
    #> Note:          *p<0.1; **p<0.05; ***p<0.01
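
The table reports a single R-squared per model, so it does not by itself give the within/between/overall split that Stata prints. As a minimal sketch, assuming Stata's definitions as squared correlations (within: demeaned prediction vs. demeaned response; between: unit means of each; overall: raw prediction vs. raw response), the three can be computed by hand from any linear prediction. The helper name `panel_r2` and the simulated data are my own and purely illustrative, and pooled OLS stands in for the panel estimator so the block runs with base R only:

```r
# Stata-style within / between / overall R-squared from a linear prediction.
# y  : response on the original (untransformed) scale
# xb : linear prediction x_it' beta on the original scale
# id : panel unit identifier; all three have one entry per observation
panel_r2 <- function(y, xb, id) {
  y_bar  <- ave(y, id)        # unit means of y, repeated per observation
  xb_bar <- ave(xb, id)       # unit means of the linear prediction
  first  <- !duplicated(id)   # keep one row per unit for the between part
  c(within  = cor(y - y_bar, xb - xb_bar)^2,
    between = cor(y_bar[first], xb_bar[first])^2,
    overall = cor(y, xb)^2)
}

# Toy balanced panel (simulated; NOT the wagepan data)
set.seed(1)
n_units   <- 50
n_periods <- 8
id <- rep(seq_len(n_units), each = n_periods)
x  <- rnorm(n_units * n_periods) + rep(rnorm(n_units), each = n_periods)
y  <- 1 + 0.5 * x + rep(rnorm(n_units), each = n_periods) +
      rnorm(n_units * n_periods)

fit <- lm(y ~ x)              # pooled OLS as a stand-in for a panel estimator
r2  <- panel_r2(y, fitted(fit), id)
print(round(r2, 4))
```

To apply this to a plm fit such as `reg.re`, one would pass the untransformed response, the untransformed linear prediction and the `nr` identifier. I believe `pmodel.response(reg.re, model = "pooling")` and `model.matrix(reg.re, model = "pooling") %*% coef(reg.re)` return the first two, but treat that as an assumption to check against the plm documentation before comparing with Stata's numbers.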