I'm doing a non-linear regression in R and want to add one moving-average term to my model to eliminate the autocorrelations in residuals.
Basically, here is the model:
y[n] = a + log((x1[n])^g + (x2[n])^g) + c*e[n-1] + e[n]
where [e]
is the moving average term.
I plan to use ARIMA(0, 0, 1)
to model residuals. However, I do not know which function I should use in R to add non-linear exogenous part to ARIMA model.
More information: I know how to use nls
command to estimate a
and g
, but do not know how to deal with e[n]
.
I know that xreg
in arima
can handle ARIMA model with linear exogenous variables. Is there a similar function to handle ARIMA model with nonlinear exogenous variables?
Thank you for the help in advance!
nlme
has such capability, as it is fitting non-linear mixed models. You can think of it an extension to nls
(a fixed-effect only non-linear regression), by allowing random effect and correlated errors.
nlme
can handle ARMA correlation, by something like correlation = corARMA(0.2, ~ 1, p = 0, q = 1, fixed = TRUE)
. This means, that residuals are MA(1) process, with initial guess of coefficient 0.2, but to be updated during model fitting. The ~ 1
suggests that MA(1)
is on intercept and there is no further grouping structure.
I am not an expert in nlme
, but I know nlme
is what you need. I produce the following example, but since I am not an expert, I can't get nlme
work at the moment. I post it here to give a start / flavour.
set.seed(0)
x1 <- runif(100)
x2 <- runif(100)
## MA(1) correlated error, with innovation standard deviation 0.1
e <- arima.sim(model = list(ma = 0.5), n = 100, sd = 0.1)
## a true model, with `a = 0.2, g = 0.5`
y0 <- 0.2 + log(x1 ^ 0.5 + x2 ^ 0.5)
## observations
y <- y0 + e
## no need to install; it comes with R; just `library()` it
library(nlme)
fit <- nlme(y ~ a + log(x1 ^ g + x2 ^ g), fixed = a + g ~ 1,
start = list(a = 0.5, g = 1),
correlation = corARMA(0.2, form = ~ 1, p = 0, q = 1, fixed = FALSE))
Similar to nls
, we have an overall model formula y ~ a + log(x1 ^ g + x2 ^ g)
, and starting values are required for iteration process. I have chosen start = list(a = 0.5, g = 1)
. The correlation
bit has been explained in the beginning.
fixed
and random
arguments in nlme
specify what should be seen as fixed effects and random effects in the overall formula. Since we have no random effect, we leave it unspecified. We want a
and g
as fixed effect, so I tried something like fixed = a + g ~ 1
. Unfortunately it does not quite work, for some reason I don't know. I read the ?nlme
, and thought this formula means that we want a common a
and g
for all observations, but later nlme
reports an error saying this is not a valid group formula.
I am also investing at this; as I said, the above gives us a start. We are already fairly close to the final answer.
Thanks to user20650 for point out my awkward error. I should use gnls
function rather than nlme
. By design nature of nlme
package, functions lme
and nlme
have to take a random
argument to work. Luckily, there are several other routines in nlme
package for extending linear models and non-linear models.
gls
and gnls
extend lm
and nls
by allowing non-diagonal variance functions.So, I should really use gnls
instead:
## no `fixed` argument as `gnls` is a fixed-effect only
fit <- gnls(y ~ a + log(x1 ^ g + x2 ^ g), start = list(a = 0.5, g = 1),
correlation = corARMA(0.2, form = ~ 1, p = 0, q = 1, fixed = FALSE))
#Generalized nonlinear least squares fit
# Model: y ~ a + log(x1^g + x2^g)
# Data: NULL
# Log-likelihood: 92.44078
#
#Coefficients:
# a g
#0.1915396 0.5007640
#
#Correlation Structure: ARMA(0,1)
# Formula: ~1
# Parameter estimate(s):
# Theta1
#0.4184961
#Degrees of freedom: 100 total; 98 residual
#Residual standard error: 0.1050295