Difference-in-differences with a panel dataset in R


I have a panel dataset on which I'd like to run a difference-in-differences (DiD) analysis. This is my current regression:

fit3 <- glm(empstat ~ factor(year) + factor(stateicp) + migrant_category +
              treated*post + treated*migrant_category +
              post*migrant_category + treated*post*migrant_category +
              race + educ + age + marst,
            data = df, weights = perwt, family = 'gaussian')

Will R treat each observation as independent of the others in this model? If so, how do I tell R that this is panel data?


Solution

  • If you are interested in fixed-effects models and difference-in-differences, use the plm package. Here is an example from Christopher Zorn:

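    # Packages used below (read_csv, %>%, plm/pdata.frame, stargazer)
    library(readr)
    library(dplyr)
    library(plm)
    library(stargazer)
    
    # Panel data
    WDI<-read_csv("https://github.com/PrisonRodeo/GSERM-Ljubljana-APD-git/raw/main/Data/WDI3.csv")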
    
    # Add a "Cold War" variable:
    WDI$ColdWar <- with(WDI,ifelse(Year<1990,1,0))
    
    # Keep a numeric year variable (for -panelAR-):
    WDI$YearNumeric<-WDI$Year
    
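    # Pull out *only* those countries that, at some
    # point during the observed periods, instituted
    # a paid parental leave policy (done before the
    # pdata.frame conversion, because dplyr verbs can
    # drop the pdata.frame class):
    WDI<-WDI %>% group_by(ISO3) %>%
      filter(any(PaidParentalLeave==1,na.rm=TRUE)) %>%
      ungroup() %>% as.data.frame()
    
    # Make the data a panel dataframe:
    WDI<-pdata.frame(WDI,index=c("ISO3","Year"))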
    
    # Create a better trend variable:
    WDI$Time<-WDI$YearNumeric-1950
    
    # FE models...
    
    fe1<-plm(ChildMortality~PaidParentalLeave+Time+
                      PaidParentalLeave*Time,data=WDI,
                    effect="individual",model="within")
    
    fe2<-plm(ChildMortality~PaidParentalLeave+Time+
                      PaidParentalLeave*Time+log(GDPPerCapita)+
                      log(NetAidReceived)+GovtExpenditures,
                    data=WDI,effect="individual",model="within")
    
    fe3<-plm(ChildMortality~PaidParentalLeave+Time+
                      PaidParentalLeave*Time,data=WDI,
                    effect="twoway",model="within")
    
    fe4<-plm(ChildMortality~PaidParentalLeave+Time+
                      PaidParentalLeave*Time+log(GDPPerCapita)+
                      log(NetAidReceived)+GovtExpenditures,
                    data=WDI,effect="twoway",model="within")
    
    # TABLE TIME
    
    stargazer(fe1,fe2,fe3,fe4,
              title="DiD Models of log(Child Mortality)",
              column.separate=c(1,1,1),align=TRUE,
              dep.var.labels.include=FALSE,
              dep.var.caption="",
              # Labels must follow the coefficient order in the table:
              # main effects and controls first, the interaction last
              covariate.labels=c("Paid Parental Leave","Time (1950=0)",
                                 "ln(GDP Per Capita)",
                                 "ln(Net Aid Received)",
                                 "Government Expenditures",
                                 "Paid Parental Leave x Time"),
              header=FALSE,model.names=FALSE,
              model.numbers=FALSE,multicolumn=FALSE,
              object.names=TRUE,notes.label="",
              column.sep.width="-15pt",
              omit.stat=c("f","ser"),type="text")
    
    DiD Models of log(Child Mortality)
    =====================================================================
                                  fe1        fe2        fe3        fe4   
    ---------------------------------------------------------------------
    Paid Parental Leave        -15.500*** -26.200*** -12.500*** -17.300* 
                                (2.420)    (7.220)    (2.960)    (9.360) 
                                                                         
    Time (1950=0)              -0.838***  -1.480***                      
                                (0.025)    (0.094)                       
                                                                         
    ln(GDP Per Capita)                    -7.110***              -4.910* 
                                           (2.290)               (2.600) 
                                                                         
    ln(Net Aid Received)                  -1.780***             -3.020***
                                           (0.471)               (0.552) 
                                                                         
    Government Expenditures                0.873***             0.842*** 
                                           (0.139)               (0.146) 
                                                                         
    Paid Parental Leave x Time  0.310***   0.524***   0.247***   0.319*  
                                (0.044)    (0.128)    (0.056)    (0.169) 
                                                                         
    ---------------------------------------------------------------------
    Observations                 2,360       622       2,360       622   
    R2                           0.496      0.717      0.009      0.143  
    Adjusted R2                  0.485      0.701      -0.035     0.014  
    =====================================================================
                                              *p<0.1; **p<0.05; ***p<0.01
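
To make the two-way fixed-effects idea behind the models above more concrete, here is a minimal base-R sketch on simulated data. Everything in it is an assumption made for illustration: the panel dimensions, the variable names (unit, period, treated, post, did), the true effect of 2, and the helper demean2. It shows that, in a balanced panel, regressing on unit and period dummies gives the same DiD coefficient as the textbook two-way within transformation, which is the kind of estimate a two-way within model targets. It is not a reproduction of plm's internals.

```r
# Simulated balanced panel: 40 units observed over 10 periods (all values made up)
set.seed(42)
n_units   <- 40
n_periods <- 10
panel <- expand.grid(unit = 1:n_units, period = 1:n_periods)

# Half the units are treated; treatment switches on from period 6 onward
panel$treated <- as.integer(panel$unit <= n_units / 2)
panel$post    <- as.integer(panel$period >= 6)
panel$did     <- panel$treated * panel$post

# Outcome = unit effect + period effect + true DiD effect + noise
true_effect <- 2
unit_fx   <- rnorm(n_units, sd = 3)
period_fx <- rnorm(n_periods, sd = 1)
panel$y <- unit_fx[panel$unit] + period_fx[panel$period] +
  true_effect * panel$did + rnorm(nrow(panel), sd = 0.5)

# (1) Two-way fixed effects via unit and period dummies
fit_dummies <- lm(y ~ did + factor(unit) + factor(period), data = panel)

# (2) Two-way within transformation (this shortcut is valid for balanced panels):
#     subtract unit means and period means, add back the grand mean
demean2 <- function(x, unit, period) {
  x - ave(x, unit) - ave(x, period) + mean(x)
}
panel$y_w   <- demean2(panel$y,   panel$unit, panel$period)
panel$did_w <- demean2(panel$did, panel$unit, panel$period)
fit_within  <- lm(y_w ~ 0 + did_w, data = panel)

est_dummies <- unname(coef(fit_dummies)["did"])
est_within  <- unname(coef(fit_within)["did_w"])
print(c(dummies = est_dummies, within = est_within))
```

Both estimates should agree up to numerical precision and land near the simulated effect. Note that the lm() standard errors in this sketch still treat observations as independent. For plm fits such as fe3, cluster-robust standard errors are commonly obtained with plm's vcovHC(), for example lmtest::coeftest(fe3, vcov. = vcovHC(fe3, cluster = "group")), assuming the lmtest package is installed.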