I have a linearized matrix stored as a thrust::device_vector<int>
Essentially, it is a matrix of dimension nc x nv stored in a linear array of this size.
I want to get the unique rows from this matrix. Two rows are unique if at least one element is different.
I want to use the CUDA thrust::sort
and thrust::unique
functions to do this. I believe I need to construct an iterator that corresponds to each row and then call sort with a functor that compares rows elementwise. But I am unclear how this will be done.
Using a strided range iterator will allow me to to specify the start of each row but the implementation of the functor is unclear.
This seems like a problem that should be solvable with thrust. Is there some better way?
I think your method is workable. Rather than sorting the matrix directly, I propose sorting a separate array of row indices such that the resulting row indices are sorted in the order of the sort order of the matrix rows.
We will create a sort functor that takes two row indices, and uses these to index into appropriate rows of the matrix. That sort functor will then order the two indicated rows using element-by-element comparison.
We will use a similar method (passing two row indices) for the "equality" functor passed to thrust::unique
. The equality functor will then test the two indicated rows for equality. I could have used a for-loop here as in the sort functor, testing for equality element-by-element, but instead chose to use a nested thrust::mismatch
algorithm, for variety.
Here is a worked example:
$ cat t1033.cu
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/unique.h>
#include <thrust/sequence.h>
#include <assert.h>
#include <iostream>
#include <thrust/execution_policy.h>
#include <thrust/mismatch.h>
typedef int mytype;
struct my_sort_func
{
int cols;
mytype *data;
my_sort_func(int _cols, mytype *_data) : cols(_cols),data(_data) {};
__host__ __device__
bool operator()(int r1, int r2){
for (int i = 0; i < cols; i++){
if (data[cols*r1+i] < data[cols*r2+i])
return true;
else if (data[cols*r1+i] > data[cols*r2+i])
return false;}
return false;
}
};
struct my_unique_func
{
int cols;
mytype *data;
my_unique_func(int _cols, mytype *_data) : cols(_cols),data(_data) {};
__device__
bool operator()(int r1, int r2){
thrust::pair<mytype *, mytype *> res = thrust::mismatch(thrust::seq, data+(r1*cols), data+(r1*cols)+cols, data+(r2*cols));
return (res.first == data+(r1*cols)+cols);
}
};
int main(){
const int ncols = 3;
mytype data[] = { 1, 2, 3, 1, 2, 3, 1, 3, 5, 2, 3, 4, 1, 2, 3, 1, 3, 5};
size_t dsize = sizeof(data)/sizeof(mytype);
assert ((dsize % ncols) == 0);
int nrows = dsize/ncols;
thrust::device_vector<mytype> d_data(data, data+dsize);
thrust::device_vector<int> rowidx(nrows); // reference rows by their index
thrust::sequence(rowidx.begin(), rowidx.end());
thrust::sort(rowidx.begin(), rowidx.end(), my_sort_func(ncols, thrust::raw_pointer_cast(d_data.data())));
int rsize = thrust::unique(rowidx.begin(), rowidx.end(), my_unique_func(ncols, thrust::raw_pointer_cast(d_data.data()))) - rowidx.begin();
thrust::host_vector<int> h_rowidx = rowidx;
std::cout << "Unique rows: " << std::endl;
for (int i = 0; i < rsize; i++){
for (int j = 0; j < ncols; j++) std::cout << data[h_rowidx[i]*ncols+j] << ",";
std::cout << std::endl;}
return 0;
}
$ nvcc -o t1033 t1033.cu
$ ./t1033
Unique rows:
1,2,3,
1,3,5,
2,3,4,
$
Notes:
I suspect overall performance would improve if the input matrix were transposed, and we were comparing columns (in the transposed matrix) instead of rows. It may provide some benefit for the sort operation, I suspect it may also provide some benefit for the unique operation. The given code however matches your description in the question, and it should be a pretty good roadmap for how to do it in the column case, although it would have to be refactored for that.
This method does not actually re-order the matrix rows. For efficiency I wanted to avoid doing lots of data-movement, since the problem statement didn't seem to depend on it. If you did actually want an intermediate data set that had the matrix rows in sorted order, I would still suggest doing the above sort operation, and then use the results to re-order the matrix in a single operation, using one of two possible methods: either a scatter/gather operation, or a thrust::permuation_iterator
combined with a thrust::copy
operation.
With a bit of cleverness, a nested thrust::mismatch
operation could have been used in the sort functor as well, in place of the for-loop.