I am trying to estimate a fixed-effects model in R using the plm package. My data, shown below, is at the firm, city, year, quarter level: for each firm and city I observe sales and income by year-quarter. My regression is income ~ sales, that is, income regressed on sales, while controlling for firm- and city-specific unobservables. I have 1000+ firms in my actual dataset.
fid = c(1,1,1,1,
2,2,2,2,
3,3,3,3,3,3,3,3,
4,4,4,4,5,5,5,5,
5,5,5,5)
cityid = c(101,101,101,101,
102,102,102,102,102,102,102,102,103,103,103,103,
103,103,103,103,
104,104,104,104,
104,104,104,104)
year = c(2000, 2000, 2000, 2000,2000,2000, 2000,2000,2001,2001,2001,2001,2002,2002,2002,2002,
2001,2001,2001,2001,2001,2001,2001,2001,2002,2002,2002,2002)
qtr = c(1,2,3,4,1,2,3,4,1,2,3,
4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
df = data.frame(fid, cityid, year, qtr, sales = sample(1:4, 7, replace = TRUE), income = 30:57)
I see that the plm function takes a panel indexed by individual and time, that is, each individual is observed over several time periods. How could I use the plm package to estimate: 1.) firm fixed effects, 2.) firm and city fixed effects, 3.) firm, city, and quarter fixed effects?
Could you explain how these differ? I am a little confused about the time component, and I am wondering whether I can use firm and city fixed effects together. With firm and city fixed effects, my panel would have each firm-city pair repeated 4 times (once per quarter), while each city may contain multiple firms.
For 3.), can I combine firm and city in the plm call but control for quarter explicitly in the formula (like factor(quarter))?
I just want a clearer understanding of how to extend plm to estimate fixed effects beyond the time dimension. I have already looked at the vignette, but it is not entirely clear to me. Any information would be great.
I think you are a bit confused here. The time unit in your dataset is the year-quarter (let's call it q_year, coded for example as 2000_1, 2000_2, etc.). So you would want to generate such a variable and use it to index the time dimension.
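A minimal sketch of building such an index in base R, assuming `year` and `qtr` columns like those in the question. The small data frame is a hypothetical stand-in for the real data, not the question's full example.

```r
# Hypothetical stand-in for the question's data: two firms, two quarters each.
df <- data.frame(
  fid  = c(1, 1, 2, 2),
  year = c(2000, 2000, 2001, 2001),
  qtr  = c(1, 2, 1, 2)
)

# Combine year and quarter into a single time index such as "2000_1".
df$q_year <- paste(df$year, df$qtr, sep = "_")

# A panel index needs each (firm, time) pair to appear at most once.
stopifnot(!anyDuplicated(df[, c("fid", "q_year")]))
```

Because the quarter is a single digit, these labels also sort in chronological order.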
You could then specify the model as follows:
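library(plm)

# Firm fixed effects via the within transformation; year-quarter effects via dummies
model <- plm(income ~ sales + as.factor(q_year), data = df,
             index = c("fid", "q_year"), model = "within")
summary(model)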
This model gives you time fixed effects (year-quarter) as well as firm fixed effects. Note that wherever 'city' does not vary over time within a firm, it is absorbed by the firm fixed effect (the city location is then a fixed firm characteristic!). In your example data this is the case for every firm except firm 3, which changes city.
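To see the absorption directly, here is a rough sketch that uses base R's `lm()` with explicit dummies instead of plm. This is the dummy-variable form of fixed effects, which gives the same slope estimates as the within estimator. The `fid` and `cityid` vectors reproduce the question's values with `rep()`; the `sales` values are a fixed stand-in for the question's random draw, so that is an assumption.

```r
# Rebuild the question's firm and city columns (same values, via rep()).
fid    <- rep(1:5, times = c(4, 4, 8, 4, 8))
cityid <- rep(c(101, 102, 103, 104), times = c(4, 8, 8, 8))
# Fixed stand-in for the question's random sales draw (an assumption).
sales  <- rep(c(2, 4, 1, 3, 3, 1, 4), times = 4)
income <- 30:57

# Firm and city dummies via lm(): the dummy-variable form of fixed effects.
fit <- lm(income ~ sales + factor(fid) + factor(cityid))
city_coefs <- coef(fit)[grep("cityid", names(coef(fit)))]

# City dummies reported as NA are aliased: the firm dummies already span them.
aliased_cities <- names(city_coefs)[is.na(city_coefs)]
print(aliased_cities)

# Only firms observed in more than one city can identify a city effect.
cities_per_firm <- tapply(cityid, fid, function(x) length(unique(x)))
movers <- names(cities_per_firm)[cities_per_firm > 1]
print(movers)
```

If `movers` is empty in the real data, city effects cannot be separated from firm effects at all, whichever package is used.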
(Note: do you have data for firms spanning multiple years? In your example data only firms 3 and 5 do; the others are observed in a single year. If that were true of every firm, you could condense the data to a four-wave design and take just the quarters as the time dimension, because the data structure would effectively hold year constant for every firm.)