Search code examples
c++cudathrust

Thrust: inplace exclusive_scan of one member


I have a vector of MyElement, which is defined as follows:

struct MyElement {
    int count;
    int prefixSum;
}

I would like to perform an in-place exclusive_scan of count but using prefixSum as the result, without changing count. Is that possible using thrust?

As an example, for the following input (prefixSum is initialized with zeros):

{ (0, 0), (0, 0), (1, 0), (2, 0), (0, 0), (1, 0), (0, 0), (1, 0), (0, 0), (1, 0) }

the correct output is:

{ (0, 0), (0, 0), (1, 0), (2, 1), (0, 3), (1, 3), (0, 4), (1, 4), (0, 5), (1, 5) }

namely count unchanged and prefixSum containing the exclusive prefix-sum of count.

This is what I've tried so far with thrust:

#include <thrust/scan.h>
#include <thrust/device_ptr.h>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

struct MyElement {
    int count;
    int prefixSum;

    // Needed by thrust
    __host__ __device__ MyElement() {
        count = 0;
        prefixSum = 0;
    }

    __host__ __device__ MyElement(int a) {
        count = 0;
        prefixSum = 0;
    }

    // Used for initialization
    __host__ MyElement(int count, int prefixSum) {
        this->count = count;
        this->prefixSum = prefixSum;
    }

    __host__ __device__ friend MyElement operator +(const MyElement& a, const MyElement& b) {
        return MyElement(0, a.count + b.count + a.prefixSum + b.prefixSum);
    }

    //__host__ __device__ MyElement& operator=(const MyElement& other) {
    //    // check for self-assignment
    //    if (&other == this) {
    //        return *this;
    //    }

    //    count = other.count;
    //    prefixSum = other.prefixSum;
    //    return *this;
    //}
};

int main(int argc, char* argv[]) {
    thrust::device_vector<MyElement> d_vector;
    d_vector.push_back(MyElement(0, 0));
    d_vector.push_back(MyElement(0, 0));
    d_vector.push_back(MyElement(1, 0));
    d_vector.push_back(MyElement(2, 0));
    d_vector.push_back(MyElement(0, 0));
    d_vector.push_back(MyElement(1, 0));
    d_vector.push_back(MyElement(0, 0));
    d_vector.push_back(MyElement(1, 0));
    d_vector.push_back(MyElement(0, 0));
    d_vector.push_back(MyElement(1, 0));

    thrust::exclusive_scan(d_vector.data(), d_vector.data() + 10, d_vector.data());
    
    // Copy vector from device to host
    thrust::host_vector<MyElement> h_vector = d_vector;

    // Print
    for (const MyElement& element : h_vector) {
        printf("{ %d, %d }\n", element.count, element.prefixSum);
    }

    return 0;
}

With the above code, the correct value for prefixSum is computed, however count is lost (set to zero). I've tried many variations of the sum and assignment operators, but I couldn't find a correct solution.


Solution

  • I ended up with a similar solution. A StackOverflow answer describes how one can perform an inclusive prefix sum on just one class member.

    Since I can't perform a scan on one member and "output" it on another member, instead of initializing prefixSum to zero, I use the same value as count (I do this in a previous kernel, without using thrust). Hence my new input becomes:

    { (0, 0), (0, 0), (1, 1), (2, 2), (0, 0), (1, 1), (0, 0), (1, 1), (0, 0), (1, 1) }

    After that, I use thrust to perform an inclusive prefix sum on prefixSum only, getting:

    { (0, 0), (0, 0), (1, 1), (2, 3), (0, 3), (1, 4), (0, 4), (1, 5), (0, 5), (1, 6) }

    I actually need an exclusive prefix sum, however since I have the original count, the value I want is simply prefixSum - count.