How to dynamically set the size of device_vectors in thrust set operations?

I have two sets A & B. The result(C) of my operation should have elements in A which are not there in B. I use set_difference to do it. However the size of result(C) has to be set before the operation. Else it has extra zeros at the end, like below:

A=
1 2 3 4 5 6 7 8 9 10 
B=
1 2 8 11 7 4 
C=
3 5 6 9 10 0 0 0 0 0

How to set the size of result(C) dynamically so that output is C= 3 5 6 9. In a real problem, I would not know the required size of result device_vector apriori.

My code:

#include <thrust/execution_policy.h>
#include <thrust/set_operations.h>
#include <thrust/sequence.h>
#include <thrust/execution_policy.h>
#include <thrust/device_vector.h>


void remove_common_elements(thrust::device_vector<int> A, thrust::device_vector<int> B, thrust::device_vector<int>& C)
{

thrust::sort(thrust::device, A.begin(), A.end());
thrust::sort(thrust::device, B.begin(), B.end());

thrust::set_difference(thrust::device, A.begin(), A.end(), B.begin(), B.end(), C.begin());
}

int main(int argc, char * argv[])
{

thrust::device_vector<int> A(10);
thrust::sequence(thrust::device, A.begin(), A.end(),1);  // x components of the 'A' vectors

thrust::device_vector<int> B(6);
B[0]=1;B[1]=2;B[2]=8;B[3]=11;B[4]=7;B[5]=4;

thrust::device_vector<int> C(A.size());

std::cout << "A="<< std::endl;
thrust::copy(A.begin(), A.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;

std::cout << "B="<< std::endl;
thrust::copy(B.begin(), B.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;

remove_common_elements(A, B, C);

std::cout << "C="<< std::endl;
thrust::copy(C.begin(), C.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;

return 0;
}

Solution

In the general case (i.e. across various thrust algorithms) there is often no way to know the output size, except what the upper bound would be. The usual approach here would be to pass a result vector whose size is the upper bound of the possible output size. As you stated already, in many cases the actual size of the output cannot be known a-priori. Thrust has no particular magic to solve this. After the operation, you will know the size of the result, and it could be copied to a new vector if the "extra zeroes" were a problem for some reason (I can't think of a reason why they would be a problem generally, except that they use up allocated space).

If this is highly objectionable, one possibility (copying this information from a response by Jared Hoberock in another forum) is to run the algorithm twice, the first time using a discard_iterator (for the output data) and the second time with a real iterator, pointing to an actual vector allocation, of the requisite size. During the first pass, the discard_iterator is used to count the size of the actual result data, even though it is not stored anywhere. Quoting directly from Jared:

In the first phase, pass a discard_iterator as the output iterator. You can compare the discard_iterator returned as the result to compute the size of the output. In the second phase, call the algorithm "for real" and output into an array sized using the result of the first phase.

The technique is demonstrated in the set_operations.cu example [0,1]:

[0] https://github.com/thrust/thrust/blob/master/examples/set_operations.cu#L25

[1] https://github.com/thrust/thrust/blob/master/examples/set_operations.cu#L127