I am trying to read values from a texture and write them back to global memory. I am sure the writing part works, because I can put constant values in the kernel and see them in the output:
__global__ void
bartureKernel( float* g_odata, int width, int height)
{
    unsigned int x = blockIdx.x*blockDim.x + threadIdx.x;
    unsigned int y = blockIdx.y*blockDim.y + threadIdx.y;
    if(x < width && y < height) {
        unsigned int idx = (y*width + x);
        g_odata[idx] = tex2D(texGrad, (float)x, (float)y).x;
    }
}
The texture I want to use is a 2D float texture with two channels, so I defined it as:
texture<float2, 2, cudaReadModeElementType> texGrad;
And the code which calls the kernel initializes the texture with some constant non-zero values:
float* d_data_grad = NULL;
cudaMalloc((void**) &d_data_grad, gradientSize * sizeof(float));
CHECK_CUDA_ERROR;
texGrad.addressMode[0] = cudaAddressModeClamp;
texGrad.addressMode[1] = cudaAddressModeClamp;
texGrad.filterMode = cudaFilterModeLinear;
texGrad.normalized = false;
cudaMemset(d_data_grad, 50, gradientSize * sizeof(float));
CHECK_CUDA_ERROR;
cudaBindTexture(NULL, texGrad, d_data_grad, cudaCreateChannelDesc<float2>(), gradientSize * sizeof(float));
float* d_data_barture = NULL;
cudaMalloc((void**) &d_data_barture, outputSize * sizeof(float));
CHECK_CUDA_ERROR;
dim3 dimBlock(8, 8, 1);
dim3 dimGrid( ((width-1) / dimBlock.x)+1, ((height-1) / dimBlock.y)+1, 1);
bartureKernel<<< dimGrid, dimBlock, 0 >>>( d_data_barture, width, height);
I know, setting the texture bytes to all "50" doesn't make much sense in the context of floats, but it should at least give me some non-zero values to read.
I can only read zeros though...
You are using `cudaBindTexture` to bind your texture to the memory allocated by `cudaMalloc`. In the kernel you are using the `tex2D` function to read values from the texture. That is why it is reading zeros.

If you bind a texture to linear memory using `cudaBindTexture`, it is read using `tex1Dfetch` inside the kernel.

`tex2D` is used to read only from textures that are bound to pitch linear memory (which is allocated by `cudaMallocPitch`) using the function `cudaBindTexture2D`, or from textures that are bound to a cudaArray using the function `cudaBindTextureToArray`.
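If you want to keep `cudaMalloc` and `cudaBindTexture`, a minimal sketch of the `tex1Dfetch` route could look like the following. The names `texGradLinear` and `fetchKernel`, the 64x32 size, and the fill values are made up for illustration, and the code assumes a toolkit older than CUDA 12, since the texture reference API used in the question was removed in CUDA 12. Note that `tex1Dfetch` takes an integer element index and does no filtering, so `cudaFilterModeLinear` from the question does not apply on this path.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical 1D texture reference: linear memory is addressed by element index.
texture<float2, 1, cudaReadModeElementType> texGradLinear;

__global__ void fetchKernel(float* g_odata, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        int idx = y * width + x;
        // Integer index into the bound buffer, not float coordinates.
        g_odata[idx] = tex1Dfetch(texGradLinear, idx).x;
    }
}

int main()
{
    const int width = 64, height = 32;            // assumed sizes
    const size_t count = size_t(width) * height;

    // Fill with real float values instead of a byte pattern from cudaMemset.
    std::vector<float2> h_grad(count, make_float2(50.0f, -1.0f));

    float2* d_grad = NULL;
    cudaMalloc((void**)&d_grad, count * sizeof(float2));   // size in float2 elements
    cudaMemcpy(d_grad, h_grad.data(), count * sizeof(float2), cudaMemcpyHostToDevice);

    // Bind the linear buffer; the last argument is the size in bytes.
    cudaBindTexture(NULL, texGradLinear, d_grad,
                    cudaCreateChannelDesc<float2>(), count * sizeof(float2));

    float* d_out = NULL;
    cudaMalloc((void**)&d_out, count * sizeof(float));

    dim3 dimBlock(8, 8, 1);
    dim3 dimGrid((width - 1) / dimBlock.x + 1, (height - 1) / dimBlock.y + 1, 1);
    fetchKernel<<<dimGrid, dimBlock>>>(d_out, width, height);

    std::vector<float> h_out(count);
    cudaMemcpy(h_out.data(), d_out, count * sizeof(float), cudaMemcpyDeviceToHost);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("first value read through the texture: %f\n", h_out[0]);

    cudaUnbindTexture(texGradLinear);
    cudaFree(d_out);
    cudaFree(d_grad);
    return 0;
}
```

This keeps your allocation code almost unchanged, at the cost of losing 2D addressing modes and hardware interpolation.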
Here is a basic table; the rest you can read in the programming guide:
Memory Type | Allocated Using | Bound Using | Read In The Kernel By |
---|---|---|---|
Linear Memory | cudaMalloc | cudaBindTexture | tex1Dfetch |
Pitch Linear Memory | cudaMallocPitch | cudaBindTexture2D | tex2D |
cudaArray | cudaMallocArray | cudaBindTextureToArray | tex1D or tex2D |
3D cudaArray | cudaMalloc3DArray | cudaBindTextureToArray | tex3D |
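To keep `tex2D` and linear filtering as in the question, the pitch linear row of the table could be applied roughly like this. Treat it as a sketch under assumptions: the 64x32 size and the fill values are invented, the kernel samples at texel centers (`+ 0.5f`) so that linear filtering does not blend neighbouring texels, and the texture reference API requires a toolkit older than CUDA 12.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Same declaration as in the question: 2D texture with two float channels.
texture<float2, 2, cudaReadModeElementType> texGrad;

__global__ void bartureKernel(float* g_odata, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // Unnormalized coordinates; +0.5f addresses the center of texel (x, y).
        g_odata[y * width + x] = tex2D(texGrad, x + 0.5f, y + 0.5f).x;
    }
}

int main()
{
    const int width = 64, height = 32;            // assumed sizes
    const size_t count = size_t(width) * height;
    std::vector<float2> h_grad(count, make_float2(50.0f, -1.0f));

    // Pitch linear allocation: each row is padded to 'pitch' bytes.
    float2* d_grad = NULL;
    size_t pitch = 0;
    cudaMallocPitch((void**)&d_grad, &pitch, width * sizeof(float2), height);
    cudaMemcpy2D(d_grad, pitch, h_grad.data(), width * sizeof(float2),
                 width * sizeof(float2), height, cudaMemcpyHostToDevice);

    texGrad.addressMode[0] = cudaAddressModeClamp;
    texGrad.addressMode[1] = cudaAddressModeClamp;
    texGrad.filterMode = cudaFilterModeLinear;
    texGrad.normalized = false;

    // Width and height are in texels, pitch is in bytes.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float2>();
    cudaBindTexture2D(NULL, texGrad, d_grad, desc, width, height, pitch);

    float* d_out = NULL;
    cudaMalloc((void**)&d_out, count * sizeof(float));

    dim3 dimBlock(8, 8, 1);
    dim3 dimGrid((width - 1) / dimBlock.x + 1, (height - 1) / dimBlock.y + 1, 1);
    bartureKernel<<<dimGrid, dimBlock>>>(d_out, width, height);

    std::vector<float> h_out(count);
    cudaMemcpy(h_out.data(), d_out, count * sizeof(float), cudaMemcpyDeviceToHost);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("first value read through the texture: %f\n", h_out[0]);

    cudaUnbindTexture(texGrad);
    cudaFree(d_out);
    cudaFree(d_grad);
    return 0;
}
```

The main differences from the code in the question are the `cudaMallocPitch` allocation, the `cudaBindTexture2D` call with explicit width, height and pitch, and sizing everything in `float2` elements rather than `float`.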