Search code examples
matlabtime-seriesautoregressive-models

Fast approach in matlab to estimate linear regression with AR terms


I am trying to estimate regression and AR parameters for (loads of) linear regressions with AR error terms. (You could also think of this as a MA process with exogenous variables):

hqll , where
AR , with lags of length p

I am following the official matlab recommendations and use regArima to set up a number of regressions and extract regression and AR parameters (see reproducible example below).

The problem: regArima is slow! For 5 regressions, matlab needs 14.24sec. And I intend to run a large number of different regression models. Is there any quicker method around?

y = rand(100,1);
r2 = rand(100,1);
r3 = rand(100,1);
r4 = rand(100,1);
r5 = rand(100,1);
exo = [r2 r3 r4 r5];

tic
 for p = 0:4
     Mdl = regARIMA(3,0,0);
     [EstMdl, ~, LogL] = estimate(Mdl,y,'X',exo,'Display','off');

 end
toc

Solution

  • Unlike the regArima function which uses Maximum Likelihood, the Cochrane-Orcutt prodecure relies on an iteration of OLS regression. There are a few more particularities when this approach is valid (refer to the link posted). But for the aim of this question, the appraoch is valid, and fast!

    I modified James Le Sage's code which covers only AR lags of order 1, to cover lags of order p.

    function result = olsc(y,x,arterms) 
    % PURPOSE: computes Cochrane-Orcutt ols Regression for AR1 errors 
    %--------------------------------------------------- 
    % USAGE: results = olsc(y,x) 
    % where: y = dependent variable vector (nobs x 1) 
    %        x = independent variables matrix (nobs x nvar) 
    %--------------------------------------------------- 
    % RETURNS: a structure 
    %        results.meth  = 'olsc' 
    %        results.beta  = bhat estimates 
    %        results.rho   = rho estimate 
    %        results.tstat = t-stats 
    %        results.trho  = t-statistic for rho estimate 
    %        results.yhat  = yhat 
    %        results.resid = residuals 
    %        results.sige  = e'*e/(n-k) 
    %        results.rsqr  = rsquared 
    %        results.rbar  = rbar-squared 
    %        results.iter  = niter x 3 matrix of [rho converg iteration#] 
    %        results.nobs  = nobs 
    %        results.nvar  = nvars 
    %        results.y     = y data vector 
    % -------------------------------------------------- 
    % SEE ALSO: prt_reg(results), plt_reg(results) 
    %--------------------------------------------------- 
    
    % written by: 
    % James P. LeSage, Dept of Economics 
    % University of Toledo 
    % 2801 W. Bancroft St, 
    % Toledo, OH 43606 
    % [email protected] 
    
    % do error checking on inputs 
    if (nargin ~= 3); error('Wrong # of arguments to olsc'); end; 
    
    [nobs nvar] = size(x); 
    [nobs2 junk] = size(y); 
    
    if (nobs ~= nobs2); error('x and y must have same # obs in olsc'); end; 
    
    % ----- setup parameters 
    ITERMAX = 100; 
    converg = 1.0; 
    rho = zeros(arterms,1); 
    iter = 1; 
    % xtmp = lag(x,1); 
    % ytmp = lag(y,1); 
    
    % truncate 1st observation to feed the lag 
    % xlag = x(1:nobs-1,:); 
    % ylag = y(1:nobs-1,1); 
    yt = y(1+arterms:nobs,1); 
    xt = x(1+arterms:nobs,:); 
    
    xlag = zeros(nobs-arterms,arterms);
    for tt = 1 : arterms
    xlag(:,nvar*(tt-1)+1:nvar*(tt-1)+nvar) = x(arterms-tt+1:nobs-tt,:);
    end
    
    ylag = zeros(nobs-arterms,arterms);
    for tt = 1 : arterms
    ylag(:,tt) = y(arterms-tt+1:nobs-tt,:);
    end
    
    % setup storage for iteration results 
    iterout = zeros(ITERMAX,3); 
    
    while (converg > 0.0001) & (iter < ITERMAX), 
    % step 1, using intial rho = 0, do OLS to get bhat 
     ystar = yt - ylag*rho; 
     xstar = zeros(nobs-arterms,nvar);
     for ii = 1 : nvar
        tmp = zeros(1,arterms);
        for tt = 1:arterms
            tmp(1,tt)=ii+nvar*(tt-1);
        end       
        xstar(:,ii) = xt(:,ii) - xlag(:,tmp)*rho; 
     end
    
    beta = (xstar'*xstar)\xstar' * ystar; 
    
    e = y - x*beta; 
    
    % truncate 1st observation to account for the lag 
    et = e(1+arterms:nobs,1);
    elagt = zeros(nobs-arterms,arterms);
    for tt = 1 : arterms
        elagt(:,tt) = e(arterms-tt+1:nobs-tt,:);
    end
    
    % step 2, update estimate of rho using residuals 
    %         from step 1 
    
    res_rho = (elagt'*elagt)\elagt' * et;  
    rho_last = rho; 
    rho = res_rho; 
    converg = sum(abs(rho - rho_last)); 
    
    % iterout(iter,1) = rho; 
    iterout(iter,2) = converg; 
    iterout(iter,3) = iter; 
    
    iter = iter + 1; 
    
    end; % end of while loop 
    
    if iter == ITERMAX 
    %  error('ols_corc did not converge in 100 iterations'); 
      print('ols_corc did not converge in 100 iterations'); 
    
    end; 
    
    result.iter= iterout(1:iter-1,:); 
    
    % after convergence produce a final set of estimates using rho-value 
     ystar = yt - ylag*rho; 
     xstar = zeros(nobs-arterms,nvar);
     for ii = 1 : nvar
        tmp = zeros(1,arterms);
        for tt = 1:arterms
            tmp(1,tt)=ii+nvar*(tt-1);
        end       
        xstar(:,ii) = xt(:,ii) - xlag(:,tmp)*rho; 
     end
    
    result.beta = (xstar'*xstar)\xstar' * ystar; 
    e = y - x*result.beta; 
    
    et = e(1+arterms:nobs,1);
    elagt = zeros(nobs-arterms,arterms);
    for tt = 1 : arterms
        elagt(:,tt) = e(arterms-tt+1:nobs-tt,:);
    end
    
    u = et - elagt*rho;
    result.vare = std(u)^2;
    
    result.meth = 'olsc'; 
    result.rho = rho; 
    result.iter = iterout(1:iter-1,:); 
    % % compute t-statistic for rho 
    % varrho = (1-rho*rho)/(nobs-2); 
    % result.trho = rho/sqrt(varrho); 
    

    (I did not adapt in the last 2 lines the t-test for rho vectors of length p, but this should be straight forward to do..)