How to estimate residuals of an autoregressive AR(2) model with missing data for a data frame in R


I want to obtain the residuals of an AR(2) model, where residual = actual value - fitted value. I have a large data frame of time series with 4000 columns, one per company, and each column holds a liquidity measure for that company. I need to transform each company's liquidity measure with an autoregressive model with 2 lags, so that first-order autocorrelation is removed. The transformation follows the AR process in the equation below, and the residuals are then used in the subsequent analysis.

[Image: equation of the autoregressive process]
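
The equation image is not reproduced here. Judging from the symbol definitions in the next paragraph, it presumably has the following form. The coefficient symbols \alpha_0 and \alpha_l and the lag index l are assumptions, not taken from the original; for the AR(2) case, x = 2.

```latex
\[
  C_t^i = \alpha_0 + \sum_{l=1}^{x} \alpha_l \, C_{t-l}^i + u_t^i
\]
```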

Here, C_t^i is the liquidity measure for stock i in month t, x is the number of lags included in the autoregressive process, and u_t^i is the residual in liquidity for stock i in month t. A small part of my data is shown below.

DATE         A      B    C       D    E     F
31/12/1999   79.5   NA   NA      6    NA    NA
03/01/2000   79.5   NA   NA      6    NA    NA
04/01/2000   79.5   NA   325     6    961   3081.9
05/01/2000   79.5   NA   322.5   6    945   2524.7
06/01/2000   79.5   NA   327.5   6    952   3272.3
07/01/2000   79.5   NA   327.5   6    941   2102.9
10/01/2000   79.5   7    327.5   6    946   2901.5
11/01/2000   79.5   7    327.5   6    888   9442.5
12/01/2000   79.5   7    331.5   6    870   7865.8
13/01/2000   79.5   7    334     6    853   7742.1

I found the function below in the stats package. Could you please help me specify the call so that it runs for each column (except the date), handles the missing values, and uses 2 lags?

ar(x, aic = TRUE, order.max = NULL,
   method = c("yule-walker", "burg", "ols", "mle", "yw"),
   na.action, series, ...)

Solution

  • Assuming your data frame is called df, you can append the residuals of an AR(2) model for each column as follows (the output is shown for a small example with two companies):

    df <- data.frame(df, apply(df[-1], 2, function(x) arima(x, order = c(2,0,0))$residuals))
    df
           Date   Company1   Company2   Company1.1    Company2.1
    1  Jan-2000 0.05365700 0.01821019 -0.036876374  0.0006985845
    2  Feb-2000 0.07201239 0.01680506 -0.001093970 -0.0005063298
    3  Mar-2000 0.08740745 0.01924687 -0.003796628  0.0017217050
    4  Apr-2000 0.10866274 0.01792439  0.010745400  0.0007815183
    5  May-2000 0.14189171 0.01848372  0.032286719  0.0014418422
    6  Jun-2000 0.15228030 0.01472494  0.023719800 -0.0024127538
    7  Jul-2000 0.10231285 0.01634404 -0.025709302 -0.0016760248
    8  Aug-2000 0.10838209 0.01919361  0.019162611  0.0009089139
    9  Sep-2000 0.08358543 0.01624093 -0.022191184 -0.0010003877
    10 Oct-2000 0.10907866 0.01768522  0.022780332  0.0001924767
    

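    Edited code (arima() accepts missing values with method = "ML", so the NAs are left in place and the residual is NA wherever the observation is missing; dropping them with x[!is.na(x)] would give residual vectors of different lengths that cannot be combined with df):

    df <- data.frame(df, apply(df[-1], 2, function(x) arima(x, order = c(2,0,0), method = "ML")$residuals))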
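
With 4000 columns, some series are likely to be constant, too short, or otherwise impossible to fit, and a single failing column stops the whole apply() call. The following is a minimal sketch of a more defensive per-column loop, not a definitive implementation. The helper name ar2_residuals, the threshold of 5 observations, the "_resid" suffix and the simulated data are all assumptions made for illustration.

```r
# Hypothetical helper: AR(2) residuals for one series, aligned with the input.
# Returns a vector of NAs if the series cannot be fitted.
ar2_residuals <- function(x) {
  x <- as.numeric(x)
  out <- rep(NA_real_, length(x))
  obs <- x[!is.na(x)]
  # Skip series that are too short or constant (threshold is an assumption)
  if (length(obs) < 5 || length(unique(obs)) < 2) return(out)
  fit <- tryCatch(
    arima(x, order = c(2, 0, 0), method = "ML"),  # NAs stay in place
    error = function(e) NULL
  )
  if (!is.null(fit)) out <- as.numeric(fit$residuals)
  out
}

# Made-up illustrative data, not the asker's data
set.seed(1)
df <- data.frame(
  DATE = sprintf("%02d/01/2000", 1:30),
  A = rep(79.5, 30),  # constant series, will be skipped
  C = c(NA, NA, 325 + as.numeric(arima.sim(list(ar = c(0.5, 0.2)), n = 28))),
  F = 3000 + 100 * as.numeric(arima.sim(list(ar = c(0.3, -0.2)), n = 30))
)
df$F[10] <- NA  # a gap in the middle of a series

# Fit every column except the date and append the residual columns
res <- as.data.frame(lapply(df[-1], ar2_residuals))
names(res) <- paste0(names(df)[-1], "_resid")
out <- cbind(df, res)
```

Because every residual vector has the same length as its input column, the result can be bound back onto df row by row. According to its documentation, stats::ar() can also handle missing values, but only with the Yule-Walker method and na.action = na.pass, so a call such as ar(x, aic = FALSE, order.max = 2, method = "yule-walker", na.action = na.pass) may be an alternative worth trying.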