
CUDA Thrust Min_Element result equals 0


I am very new to CUDA and C++, but I have been working through some problems I ran into. I want to find the smallest value, along with its index, on the GPU. Currently I have:

    __global__ void updateGB2(Particle *dev_p) {
        int i = threadIdx.x + blockIdx.x * blockDim.x;

        globalB[i] = dev_p[i].localBest;

        double *result = thrust::min_element(thrust::device, globalB, globalB + pSize);
        printf("%lf", result);
    }

This kernel is being called, but the only output is 0.0000. I am probably missing something about how to use Thrust, but from what I have read I am unsure what else to do. globalB is defined as a device variable, and the Particle array is copied from the CPU to the GPU.


Solution

  • Quoting from the Thrust documentation:

    min_element finds the smallest element in the range [first, last). It returns the first iterator i in [first, last) such that no other iterator in [first, last) points to a value smaller than *i.

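    In your code, that means result is a pointer to the smallest element, and it must be dereferenced to access the minimum value. Passing the pointer itself to printf with a %lf format is a type mismatch, so the printed number is meaningless. A complete example: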

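    #include <cstdio>
    #include <thrust/device_vector.h>
    #include <thrust/execution_policy.h>
    #include <thrust/extrema.h>
    #include <thrust/copy.h>
    #include <thrust/iterator/counting_iterator.h>
    
    // Each thread scans its own run of pSize consecutive values.
    __global__ void updateGB2(double *data, int pSize) {
        int i = threadIdx.x + blockIdx.x * blockDim.x;
    
        double* globalB = data + (i * pSize);
        // min_element returns a pointer to the smallest element, not its value.
        double* result = thrust::min_element(thrust::device, globalB, globalB + pSize);
        printf("%d %lf\n", i, *result);  // dereference to print the minimum
    }
    
    int main() 
    {
        const int pSize = 16;
        const int Nvectors = 32;
        const int Nvals = Nvectors * pSize;
    
        // Inner scope so that dv is destroyed before cudaDeviceReset().
        {
            thrust::device_vector<double> dv(Nvals);
    
            // Fill the vector with 10, 11, 12, ...
            thrust::counting_iterator<double> counter(10);
            thrust::copy(counter, counter+Nvals, dv.begin());
    
            // One thread per run of pSize values.
            double* d_h = thrust::raw_pointer_cast(dv.data());
            updateGB2<<<1, Nvectors>>>(d_h, pSize);
            cudaDeviceSynchronize();
        }
        cudaDeviceReset();
    
        return 0;
    }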
    

    which compiles and runs like so:

    $ nvcc -arch=sm_52 -o thrustdevice thrustdevice.cu 
    $ ./thrustdevice 
    0 10.000000
    1 26.000000
    2 42.000000
    3 58.000000
    4 74.000000
    5 90.000000
    6 106.000000
    7 122.000000
    8 138.000000
    9 154.000000
    10 170.000000
    11 186.000000
    12 202.000000
    13 218.000000
    14 234.000000
    15 250.000000
    16 266.000000
    17 282.000000
    18 298.000000
    19 314.000000
    20 330.000000
    21 346.000000
    22 362.000000
    23 378.000000
    24 394.000000
    25 410.000000
    26 426.000000
    27 442.000000
    28 458.000000
    29 474.000000
    30 490.000000
    31 506.000000
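
The question also asked for the index of the minimum. Because min_element returns an iterator rather than a value, the index is the distance from the start of the range to that iterator. The sketch below illustrates this on the host with std::min_element, which follows the same iterator-returning contract as the Thrust function quoted above. It is not the kernel itself: MinResult, chunkMin and makeData are hypothetical names introduced only for this illustration, and the data layout (runs of pSize values counting up from 10) is copied from the example above.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper type: the minimum of one run and where it was found.
struct MinResult {
    double value;          // the minimum itself (the dereferenced pointer)
    std::ptrdiff_t index;  // offset of the minimum from the start of the run
};

// Hypothetical host-side stand-in for one kernel thread: scan run number
// `chunk`, i.e. the pSize values starting at data + chunk * pSize.
MinResult chunkMin(const std::vector<double>& data, int chunk, int pSize) {
    const double* first = data.data() + static_cast<std::ptrdiff_t>(chunk) * pSize;
    const double* last = first + pSize;

    // Like thrust::min_element, this returns a pointer to the smallest
    // element, not the smallest value.
    const double* result = std::min_element(first, last);

    // Dereference for the value; subtract the start pointer for the index.
    return MinResult{*result, result - first};
}

// Same fill as the example above: nVectors * pSize values, 10, 11, 12, ...
std::vector<double> makeData(int nVectors, int pSize) {
    std::vector<double> data(static_cast<std::size_t>(nVectors) * pSize);
    std::iota(data.begin(), data.end(), 10.0);
    return data;
}
```

With this fill, chunkMin(makeData(32, 16), 1, 16) gives value 26 and index 0, matching the line printed for thread 1 above. Inside the kernel the same pointer subtraction, result - globalB, should give the index within that thread's run.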