I have a simple issue after running a regression with panel data using plm
with a dataset that resembles the one below:
library(plm)
library(stargazer)

dataset <- data.frame(id = rep(c(1,2,3,4,5), 2),
                      time = rep(c(0,1), each = 5),
                      group = rep(c(0,1,0,0,1), 2),
                      Y = runif(10,0,1))
model <- plm(Y ~ time*group, method = 'fd', effect = 'twoways', data = dataset,
             index = c('id', 'time'))
summary(model)
stargazer(model)
As you can see, both the model summary and the table displayed by stargazer report that my number of observations is 10. However, is it not more correct to say that N = 5, since I have removed the time dimension by taking first differences?
You are right about the number of observations. However, your code does not do what you want it to do (estimate a first-differenced model).
If you want a first-differenced model, rename the argument method to model (and delete the argument effect, because it does not make sense for a first-differenced model):
model <- plm(Y ~ time*group, model = 'fd', data = dataset,
             index = c('id', 'time'))
summary(model)
## Oneway (individual) effect First-Difference Model
##
## Call:
## plm(formula = Y ~ time * group, data = dataset, model = "fd",
## index = c("id", "time"))
##
## Balanced Panel: n = 5, T = 2, N = 10
## Observations used in estimation: 5
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -0.3067240 -0.0012185 0.0012185 0.1367080 0.1700160
## [...]
In the summary output, you can see the number of observations in your original data (N=10) and the number of observations used in the FD model (5).
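To see where the 5 comes from, here is a minimal base-R sketch that takes the first differences by hand, without plm. It assumes the same toy data as in the question; the names dY, n_total and n_fd, and the set.seed() call, are only for illustration. Each id contributes T - 1 differenced rows, so a balanced panel with n = 5 and T = 2 leaves 5 * (2 - 1) = 5 observations.

```r
# Toy data as in the question (seed added only for reproducibility)
set.seed(1)
dataset <- data.frame(id = rep(c(1, 2, 3, 4, 5), 2),
                      time = rep(c(0, 1), each = 5),
                      group = rep(c(0, 1, 0, 0, 1), 2),
                      Y = runif(10, 0, 1))

# Sort by id, then time, so diff() subtracts consecutive periods
dataset <- dataset[order(dataset$id, dataset$time), ]

# First difference of Y within each id; every id loses one row
dY <- unlist(tapply(dataset$Y, dataset$id, diff))

n_total <- nrow(dataset)  # rows in the original panel
n_fd <- length(dY)        # rows left after differencing

n_total
n_fd
```

For the fitted FD model above, length(residuals(model)) should give the same count as n_fd.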