Given a line function y = a*x + b
(a
and b
are previously known constants), it is easy to calculate the sum-of-squares distance between the line and a window of samples (1, Y1), (2, Y2), ..., (n, Yn)
(where Y1
is the oldest sample and Yn
is the newest):
sum((Yx - (a*x + b))^2 for x in 1,...,n)
I need a fast algorithm for calculating this value for a rolling window (of length n
) - I cannot rescan all the samples in the window every time a new sample arrives.
Obviously, some state should be saved and updated for every new sample that enters the window and every old sample leaves the window.
Notice that when a sample leaves the window, the indecies of the rest of the samples change as well - every Yx becomes Y(x-1). Therefore when a sample leaves the window, every other sample in the window contribute a different value to the new sum: (Yx - (a*(x-1) + b))^2
instead of (Yx - (a*x + b))^2
.
Is there a known algorithm for calculating this? If not, can you think of one? (It is ok to have some mistakes due to first-order linear approximations).
If you expand the term (Yx - (a*x + b))^2
the terms break into three parts:
a
,x
and b
. These produce some constant when summed over n
and can be ignored.Yx
and b
. These can be handled in the style of a boxcar integrator as @Xion described.-2*Yx*a*x
. The -2*a
is a constant so ignore that part. Consider the partial sum S = Y1*1 + Y2*2 + Y3*3 ... Yn*n
. Given Y1
and a running sum R = Y1 + Y2 + ... + Yn
you can find S - R
which eliminates Y1*1
and reduces each of the other terms, leaving you with Y2*1 + Y3*2 + ... + Yn*(n-1)
. Now update the running sum R
as for (2) by subtracting off Y1
and adding Y(n+1)
. Add the new Yn*n
term to S
.Now just add up all those partial terms.