Search code examples
algorithmmatlaboptimizationstatisticstime-series

Moving window Autocorrelation. One pass algorithm. Speed and Stability concerns


Helllo,

I have a timeseries x for which I want to generate a 'moving window autocorrelation at lag 1' series using a window of size window_size and a step of size window_step. I have written the following one pass algorithm code (in MATLAB) for evaluating AC at lag 1:

function AC_timeseries = rolling_autocorr_lag1(x, window_size, window_step)
   
    N = length(x);
    
    sum_x = zeros(1, N);
    sum_x(1) = x(1);
    sum_xy = zeros(1, N);

    for n = 2: N
        sum_x(n) = sum_x(n - 1) + x(n);
        sum_xy(n) = sum_xy(n - 1) + x(n) * x(n - 1);
    end
    
    % Get the indices at which to calculate autocorrelation lag 1 values
    window_ends_idx = window_size: window_step: length(x);
    
    % Preallocate array for storing autocorrelation lag 1 time series
    AC_timeseries = zeros(1, length(window_ends_idx));

    % Generate autocorrelation lag 1 time series
    for i = 1: length(window_ends_idx)
        idx = window_ends_idx(i);
        if i == 1
            sum_use = sum_x(idx);
        else
            sum_use = sum_x(idx) - sum_x(idx - window_size);
        end
        sum_x_lag1 = sum_use - x(idx);
        sum_x_fwd1 = sum_use - x(idx - window_size + 1);
        sum_xy_window = sum_xy(idx) - sum_xy(idx - window_size + 1);
        std_dev_window = std(x(idx - window_size + 1: idx));
        N = window_size - 1;
        % Calculate autocorrelation lag 1 for the window
        AC_timeseries(i) = (N * sum_xy_window - sum_x_lag1 * sum_x_fwd1) / (N * std_dev_window * N * std_dev_window);
    end

end

My Concerns:

I would first like to know if this code can be improved further to speed up computations. Also, if a better method is available please let me know. Also, I am sceptical that the variable sum_xy can overflow. How can I tackle this issue.

Is normalization of the data something that could help here? Can I make changes to the calculations/assignments to make this faster?

Reading online gives me the hint that extending matlab using C++ could give me speed gains here. (Unfortunately, I don't know C++). If this is possible please provide relevant code.

Edit1: Example inputs:

n = 10000000;
timeseries = randi(1000, 1, n);

window_size = floor(n * 0.03);
window_step = floor(window_size * (1 / 100));

AC_timeseries = rolling_autocorr_lag1(timeseries, window_size, window_step);

Solution

  • I posted this question on the MATLAB forum as well. I got a great solution from user @BrunoLuong. Which can be found here.

    The altered code is ~20x faster! I am putting his code below as well.

    function AC_timeseries = rolling_autocorr_lag1_BLU(x, window_size, window_step)
        sum_x = [0 cumsum(x)];
        sum_x2 = [0 cumsum(x.^2)];
        sum_xy = [0, cumsum(x(2:end).*x(1:end-1))];
        
        % Get the indices at which to calculate autocorrelation lag 1 values
        window_ends_idx = window_size: window_step: length(x);
        
        % Generate autocorrelation lag 1 time series
        N = window_size - 1;
        idx = window_ends_idx;
        iwin1 = idx - window_size + 1;
        sum_use = sum_x(idx+1) - sum_x(iwin1);
        sum_x_lag1 = sum_use - x(idx);
        sum_x_fwd1 = sum_use - x(iwin1);
        sum_xy_window = sum_xy(idx) - sum_xy(iwin1);
        sum_x2_window = sum_x2(idx+1) - sum_x2(iwin1);
        var_dev_window = (sum_x2_window - sum_use.^2/window_size); % N * square of std_dev_window
        % Calculate autocorrelation lag 1 for the window
        AC_timeseries = (N*sum_xy_window - sum_x_lag1.*sum_x_fwd1) ./ (N * var_dev_window);
    end