We are trying to use some of the cooperative groups features in CUDA to write a small application. We are using a Tesla V100 card with CUDA 11.0. When we call the is_valid() method on a thread_block, we get the following error:
error: class "cooperative_groups::__v1::thread_block" has no member "is_valid"
The error can be reproduced with the CUDA-provided sample simpleCooperativeGroups.cu by calling is_valid() in the kernel cgkernel(). The relevant snippet from the modified sample is below.
__global__ void cgkernel(){
    // threadBlockGroup includes all threads in the block
    thread_block threadBlockGroup = this_thread_block();
    int threadBlockGroupSize = threadBlockGroup.size();
    // workspace array in shared memory required for reduction
    extern __shared__ int workspace[];
    int input, output, expectedOutput;
    // input to reduction, for each thread, is its rank in the group
    input = threadBlockGroup.thread_rank();
    // expected output from analytical formula (n-1)(n)/2
    // (noting that indexing starts at 0 rather than 1)
    expectedOutput = (threadBlockGroupSize-1)*threadBlockGroupSize/2;
    // perform reduction
    output = sumReduction(threadBlockGroup, workspace, input);
    // this is the line that triggers the compile error
    bool valid = threadBlockGroup.is_valid();
.
.
.
}
Any suggestions to resolve this would be of great help.
Studying cooperative_groups.h, it appears that the only cg classes for which an is_valid() method is provided are grid_group and multi_grid_group.
Therefore, at this time, these are the only groups for which the method is available, and you should not attempt to use it with other group types. I think the best assumption is that other group types are always considered valid.
I suspect the logic here is that grid and multi-grid groups have specific launch configuration and platform requirements, so they can be invalid when those requirements are not met. The other group types can be created on any supported platform with any launch configuration, so they cannot be invalid (at least in those respects). I don't intend that as a bulletproof statement under any possible interpretation, but as a general guide to the reasoning.
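To illustrate the distinction, here is a minimal sketch of where is_valid() can be used. It is not taken from the sample: the kernel name checkGridValid, the gridValid flag and the 2x64 launch shape are all made up for illustration. It assumes a device that supports cooperative launch, and it launches the same kernel once with the ordinary <<<>>> syntax and once with cudaLaunchCooperativeKernel so you can compare what grid.is_valid() reports in each case.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Hypothetical kernel: records whether the grid group reports itself valid.
__global__ void checkGridValid(int *gridValid)
{
    // Block-level group: thread_block has no is_valid() member,
    // so it is simply used as-is.
    cg::thread_block block = cg::this_thread_block();

    // Grid-level group: is_valid() is available here.
    cg::grid_group grid = cg::this_grid();
    bool valid = grid.is_valid();

    // Only attempt a grid-wide barrier when the group says it is usable.
    if (valid) {
        grid.sync();
    }

    // One thread of the first block writes the result back.
    if (blockIdx.x == 0 && block.thread_rank() == 0) {
        *gridValid = valid ? 1 : 0;
    }
}

int main()
{
    int supported = 0;
    cudaDeviceGetAttribute(&supported, cudaDevAttrCooperativeLaunch, 0);
    if (!supported) {
        printf("Device 0 does not support cooperative launch\n");
        return 0;
    }

    int *dValid = nullptr;
    int hValid = -1;
    cudaMalloc(&dValid, sizeof(int));

    // Ordinary launch: the grid group is not set up for grid-wide sync.
    checkGridValid<<<2, 64>>>(dValid);
    cudaMemcpy(&hValid, dValid, sizeof(int), cudaMemcpyDeviceToHost);
    printf("regular launch:     grid.is_valid() = %d\n", hValid);

    // Cooperative launch: arguments are passed as an array of pointers.
    void *args[] = { &dValid };
    cudaError_t err = cudaLaunchCooperativeKernel((void *)checkGridValid,
                                                  dim3(2), dim3(64),
                                                  args, 0, 0);
    if (err != cudaSuccess) {
        printf("cooperative launch failed: %s\n", cudaGetErrorString(err));
    } else {
        cudaMemcpy(&hValid, dValid, sizeof(int), cudaMemcpyDeviceToHost);
        printf("cooperative launch: grid.is_valid() = %d\n", hValid);
    }

    cudaFree(dValid);
    return 0;
}
```

For the original cgkernel(), the simplest fix along these lines is to drop the is_valid() call on threadBlockGroup entirely and keep such checks for grid_group or multi_grid_group objects only.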