I'm currently trying to use CUDA to evaluate a difference equation that represents an exponential moving average filter. The filter is described by the following difference equation
y[n] = y[n-1] * beta + alpha * x[n]
where alpha
and beta
are constants defined as
alpha = (2.0 / (1 + Period))
beta = 1 - alpha
How can the above difference equation be manipulated to get the system response for this filter? What would be an efficient way to implement this filter on the GPU?
I am working on a GTX 570.
I would propose to manipulate the above difference equation as indicated below and then using CUDA Thrust primitives.
DIFFERENCE EQUATION MANIPULATION - EXPLICIT FORM OF THE DIFFERENCE EQUATION
By simple algebra, you can find the following:
y[1] = beta * y[0] + alpha * x[1]
y[2] = beta^2 * y[0] + alpha * beta * x[1] + alpha * x[2]
y[3] = beta^3 * y[0] + alpha * beta^2 * x[1] + alpha * beta * x[2] + alpha * x[3]
Accordingly, the explicit form is the following:
y[n] / beta^n = y[0] + alpha * x[1] / beta + alpha * x[2] / beta^2 + ...
CUDA THRUST IMPLEMENTATION
You can implement the above explicit form by the following steps:
d_input
to alpha
except for d_input[0] = 1.
;d_1_over_beta_to_the_n
equal to 1, 1/beta, 1/beta^2, 1/beta^3, ...
;d_input
by d_1_over_beta_to_the_n
;inclusive_scan
to obtain the sequence of the y[n] / beta^n
;1, 1/beta, 1/beta^2, 1/beta^3, ...
.EDIT
The above approach can be recommended for Linear Time-Varying (LTV) systems. For Linear Time-Invariant (LTI) systems, the FFT approach mentioned by Paul can be recommended. I'm providing an example of that approach by using CUDA Thrust and cuFFT in my answer to FIR filter in CUDA.
FULL CODE
#include <thrust/sequence.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
int main(void)
{
int N = 20;
// --- Filter parameters
double alpha = 2.7;
double beta = -0.3;
// --- Defining and initializing the input vector on the device
thrust::device_vector<double> d_input(N,alpha * 1.);
d_input[0] = d_input[0]/alpha;
// --- Defining the output vector on the device
thrust::device_vector<double> d_output(d_input);
// --- Defining the {1/beta^n} sequence
thrust::device_vector<double> d_1_over_beta(N,1./beta);
thrust::device_vector<double> d_1_over_beta_to_the_n(N,1./beta);
thrust::device_vector<double> d_n(N);
thrust::sequence(d_n.begin(), d_n.end());
thrust::inclusive_scan(d_1_over_beta.begin(), d_1_over_beta.end(), d_1_over_beta_to_the_n.begin(), thrust::multiplies<double>());
thrust::transform(d_1_over_beta_to_the_n.begin(), d_1_over_beta_to_the_n.end(), d_input.begin(), d_input.begin(), thrust::multiplies<double>());
thrust::inclusive_scan(d_input.begin(), d_input.end(), d_output.begin(), thrust::plus<double>());
thrust::transform(d_output.begin(), d_output.end(), d_1_over_beta_to_the_n.begin(), d_output.begin(), thrust::divides<double>());
for (int i=0; i<N; i++) {
double val = d_output[i];
printf("Device vector element number %i equal to %f\n",i,val);
}
// --- Defining and initializing the input vector on the host
thrust::host_vector<double> h_input(N,1.);
// --- Defining the output vector on the host
thrust::host_vector<double> h_output(h_input);
h_output[0] = h_input[0];
for(int i=1; i<N; i++)
{
h_output[i] = h_input[i] * alpha + beta * h_output[i-1];
}
for (int i=0; i<N; i++) {
double val = h_output[i];
printf("Host vector element number %i equal to %f\n",i,val);
}
for (int i=0; i<N; i++) {
double val = h_output[i] - d_output[i];
printf("Difference between host and device vector element number %i equal to %f\n",i,val);
}
getchar();
}