A kernel with a shared array and a couple of local ints:
__global__ void myKern()
{
int globalID = ....; //initialize global thread ID
__shared__ int TMS[3]; //populate shared array in a simple way
if (globalID == 0)
{
TMS[0] = 0;
TMS[1] = 1;
TMS[2] = 2;
}
__syncthreads();
int val0 = 69;
int val1 = 36;
int val2 = 92;
int random_number = ....; //use cuRand to get a random number between 0 and 3
int output = TMS[random_number];
//at this point, I want the variable "output" to be used to access one of my local ints.
//For example, if "output" = 2, I want to be able to print val2 to screen.
//In a fantasy computer language this might look something like:
//std::cout<< "val" + "output";
//I just want 92 to be printed to the screen.
???
}
This may seem like an odd algorithm, but if I can do this, it will let me combine the speed of registers with the large size of shared memory in my CUDA project. Please, no brute-force binary solutions, since I will be using a shared array of size 2698 with 33 local variables.
You can put the local values in an array and index it with the value read from shared memory:
int vals[] = { 69, 36, 92 };
int random_number = ....;
int output = TMS[random_number];
int chosen = vals[output];
This assumes the random number is between 0 and 2, so that it is a valid index into TMS.
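For completeness, here is a minimal sketch of how the array lookup could sit inside a whole program. The kernel name `lookupKernel`, the `pick` parameter (standing in for the cuRAND draw that the question elides), and the `result` pointer are hypothetical names introduced only for this illustration. One caveat worth checking on your own build: an array indexed with a run-time value is generally placed in local memory by the compiler rather than in registers, so this may not give you the register speed you are hoping for.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: `pick` replaces the elided cuRAND draw and must be 0, 1, or 2.
__global__ void lookupKernel(int pick, int *result)
{
    __shared__ int TMS[3];

    // Shared memory is per block, so one thread of each block fills the table.
    if (threadIdx.x == 0) {
        TMS[0] = 0;
        TMS[1] = 1;
        TMS[2] = 2;
    }
    __syncthreads();

    // The former val0, val1, val2 gathered into one per-thread array.
    int vals[] = { 69, 36, 92 };

    int output = TMS[pick];     // index read from shared memory
    int chosen = vals[output];  // run-time index into the per-thread array

    // Report from a single thread to keep the output short.
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        printf("%d\n", chosen);
        *result = chosen;
    }
}

int main()
{
    int *d_result = nullptr;
    int h_result = -1;

    cudaMalloc(&d_result, sizeof(int));
    lookupKernel<<<1, 32>>>(2, d_result);  // pick = 2 selects vals[2]
    cudaDeviceSynchronize();               // wait for the kernel and flush device printf
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_result);

    return h_result == 92 ? 0 : 1;
}
```

With `pick` set to 2, the thread reads `TMS[2]`, which is 2, and then `vals[2]`, which is 92. The sketch tests `threadIdx.x == 0` rather than a global ID when filling `TMS`, because each block has its own copy of a `__shared__` array. To see where the compiler actually placed `vals`, you can inspect the output of `nvcc -Xptxas -v` for local memory usage.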