I am having some trouble with creating a functor properly to access a device vector. Basically, I have two device vectors that I would like to use inside a functor. The functor is called during a for_each.
Here is my functor:
struct likelihood_functor
{
    int N;
    float* v1;
    float* v2;

    likelihood_functor(int _N, float* _v1, float* _v2) : N(_N), v1(_v1), v2(_v2) {}

    template <typename Tuple>
    __host__ __device__ void operator()(Tuple t)
    {
        float A    = thrust::get<0>(t);
        float rho  = thrust::get<1>(t);
        float mux  = thrust::get<2>(t);
        float muy  = thrust::get<3>(t);
        float sigx = thrust::get<4>(t);
        float sigy = thrust::get<5>(t);

        thrust::device_ptr<float> v1_p(v1);
        thrust::device_vector<float> X(v1_p, v1_p + N);
        thrust::device_ptr<float> v2_p(v2);
        thrust::device_vector<float> Y(v2_p, v2_p + N);

        thrust::get<6>(t) = 600*logf(A)
                          - 600/2*logf(sigx*sigx*sigy*sigy*(1-rho*rho))
                          - thrust::reduce(X.begin(), X.end())
                          - thrust::reduce(Y.begin(), Y.end())
                          - 2*rho/(sigx*sigy);
    }
};
And here is my main():
int main(void)
{
    // create a 2D dataset
    const int N = 2500; // number of counts
    thrust::device_vector<float> data_x(N);
    thrust::device_vector<float> data_y(N);
    thrust::counting_iterator<unsigned int> begin(0);
    thrust::transform(begin,
                      begin + N,
                      data_x.begin(),
                      get_normal(5.f, 1.f, 2.f));
    thrust::transform(begin,
                      begin + N,
                      data_y.begin(),
                      get_normal(5.f, 1.f, 2.f));

    //
    // Some code here to initialize A_n, rho_n, mux_n etc...
    //

    // apply the transformation
    thrust::for_each(
        thrust::make_zip_iterator(
            thrust::make_tuple(A_n.begin(), rho_n.begin(), mux_n.begin(), muy_n.begin(), sigx_n.begin(), sigy_n.begin(), L.begin())
        ),
        thrust::make_zip_iterator(
            thrust::make_tuple(A_n.end(), rho_n.end(), mux_n.end(), muy_n.end(), sigx_n.end(), sigy_n.end(), L.end())
        ),
        likelihood_functor(N, thrust::raw_pointer_cast(&(data_x[0])), thrust::raw_pointer_cast(&(data_y[0])))
    );

    // print the output
    for (int i = 0; i < 4096; i++)
    {
        std::cout << "[" << i << "] : " << L[i] << std::endl;
    }
}
The code compiles, but it does not run. I suspect this is because the device_vectors X and Y are not constructed properly inside my functor.
I have used the same code to create X and Y in my main function, and when I do that the program runs fine (in that case, I do not call the functor). What is different inside a functor that makes something work in the main program but not in the functor?
Is there another way of doing what I am trying to do?
Thank you for the help!
EDIT: The answer below is no longer correct. Thrust algorithms are usable in a variety of ways in CUDA device code. thrust::device_vector is still generally not usable in CUDA device code (although the underlying data is).
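To illustrate the point in this edit, here is a minimal sketch of calling thrust::reduce from inside a __device__ functor. It assumes a reasonably recent Thrust (device-side algorithm calls arrived around Thrust 1.8, as far as I know). It works on raw pointers with the thrust::seq execution policy instead of constructing a device_vector. The names sum_offset_functor and subtract_sums are hypothetical, and the expression is deliberately simpler than the likelihood in the question.

```cuda
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/for_each.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/reduce.h>
#include <thrust/tuple.h>

// Hypothetical functor: holds raw device pointers (not device_vectors)
// and reduces over them from device code.
struct sum_offset_functor
{
    int N;
    const float* v1;
    const float* v2;

    sum_offset_functor(int n, const float* a, const float* b) : N(n), v1(a), v2(b) {}

    // __device__ only: v1 and v2 point to device memory, so the body
    // must not run on the host.
    template <typename Tuple>
    __device__ void operator()(Tuple t)
    {
        // thrust::seq runs the reduction sequentially in the calling thread.
        float sx = thrust::reduce(thrust::seq, v1, v1 + N, 0.0f);
        float sy = thrust::reduce(thrust::seq, v2, v2 + N, 0.0f);

        // out = in - sum(x) - sum(y)
        thrust::get<1>(t) = thrust::get<0>(t) - sx - sy;
    }
};

// Hypothetical helper: data_x and data_y are assumed to have the same size,
// and in and out are assumed to have the same size.
void subtract_sums(const thrust::device_vector<float>& data_x,
                   const thrust::device_vector<float>& data_y,
                   const thrust::device_vector<float>& in,
                   thrust::device_vector<float>& out)
{
    const int N = static_cast<int>(data_x.size());
    thrust::for_each(
        thrust::make_zip_iterator(thrust::make_tuple(in.begin(), out.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(in.end(), out.end())),
        sum_offset_functor(N,
                           thrust::raw_pointer_cast(data_x.data()),
                           thrust::raw_pointer_cast(data_y.data())));
}
```

Note that every thread repeats the full O(N) reduction here, so this is mainly useful when each element really needs a different reduction. When the sums are the same for every element, computing them once up front is likely cheaper.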
Thrust algorithms (e.g. transformations, reductions, etc.) cannot be used in CUDA device code.
That is, any function preceded by __global__ or __device__ cannot use Thrust (for example, it cannot call thrust::reduce).
Therefore your functor will not work in device code, since it uses Thrust constructs (such as thrust::reduce).
I realize you say that your code compiles, but I'm not really sure I believe that. If I try to declare a thrust::device_vector inside __device__ code, I get compile errors. Since the code you've shown here is incomplete in many ways, I'm not able to easily demonstrate that with your code, due to other problems with what you've posted.
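One way to restructure this so that no device_vector is needed inside the functor: the two sums do not depend on the tuple element, so they can be reduced once from host code and handed to the functor as plain floats. The following is only a sketch under that assumption. likelihood_functor_presummed and log_likelihoods are hypothetical names, mux and muy are left out because the posted expression never uses them, and the formula is copied from the question as-is rather than checked for statistical correctness.

```cuda
#include <cmath>
#include <thrust/device_vector.h>
#include <thrust/for_each.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/reduce.h>
#include <thrust/tuple.h>

// Hypothetical functor: same arithmetic as the question, but the sums of
// X and Y arrive as scalars computed before the for_each call.
struct likelihood_functor_presummed
{
    float sum_x;
    float sum_y;

    likelihood_functor_presummed(float sx, float sy) : sum_x(sx), sum_y(sy) {}

    template <typename Tuple>
    __host__ __device__ void operator()(Tuple t)
    {
        float A    = thrust::get<0>(t);
        float rho  = thrust::get<1>(t);
        float sigx = thrust::get<2>(t);
        float sigy = thrust::get<3>(t);

        // 600/2 in the question is integer division, i.e. 300.
        thrust::get<4>(t) = 600.0f * logf(A)
                          - 300.0f * logf(sigx * sigx * sigy * sigy * (1.0f - rho * rho))
                          - sum_x
                          - sum_y
                          - 2.0f * rho / (sigx * sigy);
    }
};

// Hypothetical helper: all parameter vectors and L are assumed to have the same size.
void log_likelihoods(const thrust::device_vector<float>& data_x,
                     const thrust::device_vector<float>& data_y,
                     const thrust::device_vector<float>& A,
                     const thrust::device_vector<float>& rho,
                     const thrust::device_vector<float>& sigx,
                     const thrust::device_vector<float>& sigy,
                     thrust::device_vector<float>& L)
{
    // Reductions launched from host code; each runs once on the device.
    float sum_x = thrust::reduce(data_x.begin(), data_x.end(), 0.0f);
    float sum_y = thrust::reduce(data_y.begin(), data_y.end(), 0.0f);

    thrust::for_each(
        thrust::make_zip_iterator(
            thrust::make_tuple(A.begin(), rho.begin(), sigx.begin(), sigy.begin(), L.begin())),
        thrust::make_zip_iterator(
            thrust::make_tuple(A.end(), rho.end(), sigx.end(), sigy.end(), L.end())),
        likelihood_functor_presummed(sum_x, sum_y));
}
```

In your main() this would replace the existing for_each call, with data_x, data_y and the parameter vectors passed directly instead of raw pointers.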