I have an autocorrelation problem in my panel data, so I decided to use the first-difference method to deal with it.
Most of my independent variables are binary, so after taking first differences I get -1, 0, and 1 instead of 0 or 1 as before.
Is this OK?
Besides, my data set is laid out over time as shown below. I am not sure how to apply the first-difference method here, since multiple different incidents can happen on the same day:
Date ID X Y Z L M A B C D E
01/01/2017 A 0 1 0 0 0 0 1 0 0 7.8
01/01/2017 A 0 1 0 0 0 1 0 0 1 6.5
01/01/2017 B 0 0 0 0 1 1 0 0 1 6.5
01/03/2017 A 0 1 0 0 0 0 0 0 0 7.8
01/04/2017 C 0 0 1 0 0 1 0 0 0 6.5
01/04/2017 C 0 0 0 0 0 0 1 0 0 7.3
I sorted this again by Date and ID, which gives the following:
Date ID X Y Z L M A B C D E
01/01/2017 A 0 1 0 0 0 0 1 0 0 7.8
01/01/2017 A 0 1 0 0 0 1 0 0 1 6.5
01/01/2017 B 0 0 0 0 1 1 0 0 1 6.5
01/03/2017 A 0 1 0 0 0 0 0 0 0 7.8
01/04/2017 C 0 0 1 0 0 1 0 0 0 6.5
01/04/2017 C 0 0 0 0 0 0 1 0 0 7.3
Also, is this new sort order OK to use in my panel regression, and can I take first differences following this row sequence?
A regressor may be either time-invariant or time-varying. For some estimators, notably the within and first-differences estimators, only the coefficients of time-varying regressors are identified (Cameron and Trivedi, Microeconometrics: Methods and Applications). Some of your regressors seem to be time-invariant.
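To illustrate the point about identification, here is a minimal sketch of first differencing within each unit, using only the Python standard library. The toy panel, the column names x and z, and the helper first_difference are hypothetical and not taken from the question's data. The sketch assumes exactly one observation per ID per period, which the posted data does not satisfy as it stands.

```python
from collections import defaultdict

# Hypothetical toy panel: (id, period, binary regressor x, time-invariant regressor z)
rows = [
    ("A", 1, 0, 1),
    ("A", 2, 1, 1),
    ("A", 3, 0, 1),
    ("B", 1, 1, 0),
    ("B", 2, 1, 0),
]

def first_difference(rows):
    """Difference every variable within a unit, ordered by period."""
    by_id = defaultdict(list)
    for unit, period, *values in rows:
        by_id[unit].append((period, values))
    out = []
    for unit in sorted(by_id):
        obs = sorted(by_id[unit], key=lambda o: o[0])  # order by period within the unit
        # Pair each observation with its predecessor; the first period of each unit is lost
        for (_, prev), (period, curr) in zip(obs, obs[1:]):
            out.append((unit, period, *[c - p for c, p in zip(curr, prev)]))
    return out

diffed = first_difference(rows)
for row in diffed:
    print(row)
```

In this sketch the differenced binary regressor takes values in {-1, 0, 1}, which is expected, while the time-invariant regressor differences to zero in every row, so its coefficient cannot be estimated after differencing.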
You are not dealing with time series, but with panel (longitudinal) data, so of course you have repeated IDs and dates. That said, you need to deal with autocorrelation using panel-data tools such as the Arellano-Bond and Blundell-Bond estimators, to mention a few. See pgmm in the R plm package, or xtdpdsys or xtabond in Stata.
If more than one variable identifies your panel ID, then you can combine them into a single ID; for R, see "R create ID within a group". If you are working with Stata, you could do:
egen id = group(sub_id_1 sub_id_2)
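For readers working outside R or Stata, the same idea can be sketched in plain Python. The helper group_ids and the sample key pairs are hypothetical; the numbering (consecutive integers assigned in sorted order of the distinct key combinations) is meant to roughly mirror what egen's group() produces, not to reproduce it exactly.

```python
def group_ids(keys):
    """Map each distinct key tuple to a consecutive integer ID, starting at 1."""
    mapping = {key: i for i, key in enumerate(sorted(set(keys)), start=1)}
    return [mapping[key] for key in keys]

# Hypothetical (sub_id_1, sub_id_2) pairs, one per row of the data set
pairs = [("A", 1), ("A", 2), ("B", 1), ("A", 1)]
ids = group_ids(pairs)
print(ids)
```

Rows that share the same pair of sub-identifiers receive the same ID, which can then serve as the panel unit.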