
How to use thrust::copy_if using pointers


I am trying to copy the non-zero elements of an array to a different array using pointers. I tried implementing the solution from "thrust copy_if: incomplete type is not allowed", but I get zeros in my resulting array. Here is my code. This is the predicate functor:

struct is_not_zero
{
    __host__ __device__
    bool operator()( double x)
    {
        return (x != 0);
    }
};

And this is where the copy_if function is used:

double out[5];
thrust::device_ptr<double> output = thrust::device_pointer_cast(out);
    double *test1;

    thrust::device_ptr<double> gauss_res(hostResults1);

   thrust::copy_if(thrust::host,gauss_res, gauss_res+3,output, is_not_zero());

    test1 = thrust::raw_pointer_cast(output);
    
    for(int i =0;i<6;i++) {
        cout << test1[i] << " the number " << endl;
    }

where hostResults1 is the output array from a kernel.


Solution

  • You are making a variety of errors, as discussed in the comments, and you have not provided complete code, so it is not possible to list all of them. Generally speaking, you appear to be mixing up host and device activity, and host and device pointers. For example, out is a host stack array, yet it is wrapped in a thrust::device_ptr, and the print loop reads six elements from a five-element array. Host and device data should generally be kept separate, and treated separately, in algorithms. The exception would be copying from device to host, but this can't be done with thrust::copy and raw pointers. You must use vector iterators or properly decorated thrust device pointers.

    Here is a complete example based on what you have shown:

    $ cat t66.cu
    #include <thrust/copy.h>
    #include <iostream>
    #include <thrust/device_ptr.h>
    struct is_not_zero
    {
        __host__ __device__
        bool operator()( double x)
        {
            return (x != 0);
        }
    };
    
    
    int main(){
        const int ds = 5;
        double *out, *hostResults1;
        cudaMalloc(&out, ds*sizeof(double));
        cudaMalloc(&hostResults1, ds*sizeof(double));
        cudaMemset(out, 0, ds*sizeof(double));
        double test1[ds];
        for (int i = 0; i < ds; i++) test1[i] = 1;
        test1[3] = 0;
        cudaMemcpy(hostResults1, test1, ds*sizeof(double), cudaMemcpyHostToDevice);
        thrust::device_ptr<double> output = thrust::device_pointer_cast(out);
    
        thrust::device_ptr<double> gauss_res(hostResults1);
    
        thrust::copy_if(gauss_res, gauss_res+ds,output, is_not_zero());
        cudaMemcpy(test1, out, ds*sizeof(double), cudaMemcpyDeviceToHost);
        for(int i =0;i<ds;i++) {
            std::cout << test1[i] << " the number " << std::endl;
        }
    }
    $ nvcc -o t66 t66.cu
    $ ./t66
    1 the number
    1 the number
    1 the number
    1 the number
    0 the number
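
Note that the final 0 in the output above is not a copied element: only four inputs are non-zero, so the fifth slot of out still holds the value written by cudaMemset. thrust::copy_if returns an iterator one past the last element it wrote, which can be used to find out how many elements were actually copied. The following is a minimal sketch of that idea, not part of the original answer. The helper name copy_nonzero is hypothetical, it assumes a working CUDA device, and CUDA error checking is omitted for brevity.

```cuda
#include <thrust/copy.h>
#include <thrust/device_ptr.h>
#include <cuda_runtime.h>
#include <iostream>
#include <vector>

struct is_not_zero
{
    __host__ __device__
    bool operator()(double x) const
    {
        return x != 0;
    }
};

// Hypothetical helper (not from the original answer): stages h_in in device
// memory, runs copy_if on the device, and returns only the non-zero values.
std::vector<double> copy_nonzero(const std::vector<double>& h_in)
{
    const size_t n = h_in.size();
    if (n == 0) return std::vector<double>();

    double *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(double));
    cudaMalloc(&d_out, n * sizeof(double));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(double), cudaMemcpyHostToDevice);

    // Wrap the raw device pointers so Thrust dispatches to the device.
    thrust::device_ptr<double> in  = thrust::device_pointer_cast(d_in);
    thrust::device_ptr<double> out = thrust::device_pointer_cast(d_out);

    // copy_if returns an iterator one past the last element written.
    thrust::device_ptr<double> out_end =
        thrust::copy_if(in, in + n, out, is_not_zero());
    const size_t kept = out_end - out;

    // Copy back only the elements that were actually written.
    std::vector<double> h_out(kept);
    if (kept > 0) {
        cudaMemcpy(h_out.data(), d_out, kept * sizeof(double),
                   cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main()
{
    // Same data as the example above: five values, one of them zero.
    std::vector<double> result = copy_nonzero({1.0, 1.0, 1.0, 0.0, 1.0});
    std::cout << result.size() << " non-zero values" << std::endl;
    for (double v : result) {
        std::cout << v << " the number" << std::endl;
    }
}
```

Built with nvcc in the same way as t66.cu, this should report 4 non-zero values and print only those four, without the trailing leftover 0.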