import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy as np
dims=img_in.shape
rows=dims[0]
columns=dims[1]
channels=dims[2]
#To be used in CUDA Device
N=columns
#create output image matrix
img_out=np.zeros([rows,cols,channels])
#Convert img_in pixels to 8-bit int
img_in=img_in.astype(np.int8)
img_out=img_out.astype(np.int8)
#Allocate memory for input image,output image and N
img_in_gpu = cuda.mem_alloc(img_in.size * img_in.dtype.itemsize)
img_out_gpu= cuda.mem_alloc(img_out.size * img_out.dtype.itemsize)
N=cuda.mem_alloc(N.size*N.dtype.itemsize)
#Transfer both input and now empty(output) image matrices from host to device
cuda.memcpy_htod(img_in_gpu, img_in)
cuda.memcpy_htod(img_out_gpu, img_out)
cuda.memcpy_htod(N_out_gpu, N)
#CUDA Device
mod=SourceModule("""
__global__ void ArCatMap(int *img_in,int *img_out,int *N)
{
int col = threadIdx.x + blockIdx.x * blockDim.x;
int row = threadIdx.y + blockIdx.y * blockDim.y;
int img_out_index=col + row * N;
int i=(row+col)%N;
int j=(row+2*col)%N;
img_out[img_out_index]=img_in[]
}""")
func = mod.get_function("ArCatMap")
#for i in range(1,385):
func(out_gpu, block=(4,4,1))
cuda_memcpy_dtoh(img_out,img_in)
cv2_imshow(img_out)
I have a 512 x 512 image. I am trying to convert all the elements of the input image img_in to 8-bit integers using numpy.astype, and I do the same for the output image matrix img_out. When I call cuda.mem_alloc(), I get errors saying 'type int has no attribute called size' and 'type int has no attribute called dtype'. I also get an error saying 'int has no attribute called astype'. What could be causing these errors?
You are getting a Python error, not a CUDA one. You defined N as N=columns (that is, dims[1]), so it is just a single integer value. A plain Python int has no size attribute and no dtype attribute; those belong to NumPy arrays. You are asking for both in the call to cuda.mem_alloc, which is why it fails.
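To see the difference, here is a small check you can run without a GPU. It is only an illustration: the value 512 stands in for dims[1], and N32 is a name I made up for the NumPy version of N.

```python
import numpy as np

# img_in.shape is a tuple of plain Python ints, so N = dims[1] is a plain int.
N = 512

# A plain int has neither attribute, hence the errors from the mem_alloc line.
print(hasattr(N, "size"), hasattr(N, "dtype"))   # False False

# A NumPy scalar does carry them (N32 is a hypothetical name).
N32 = np.int32(N)
print(N32.size, N32.dtype, N32.dtype.itemsize)   # 1 int32 4
```

So the expression N.size*N.dtype.itemsize only makes sense for a NumPy object, not for the int you got from the shape tuple.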
You don't need to allocate device memory for a single int; you can just pass it by value. Define the kernel as __global__ void ArCatMap(int *img_in,int *img_out,int N) instead of passing a pointer for N.
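A rough sketch of what passing N by value could look like from PyCUDA is below. Several things in it are my assumptions, not taken from your code: a square, single-channel image stored as int32 (so it matches int * in the kernel), the source pixel being read from (i, j) since your kernel leaves that index blank, a 16x16 block size, and the helper names cat_map_cpu and cat_map_gpu. The CPU function is only there so you can compare results.

```python
import numpy as np

KERNEL_SRC = """
__global__ void ArCatMap(const int *img_in, int *img_out, int N)
{
    int col = threadIdx.x + blockIdx.x * blockDim.x;
    int row = threadIdx.y + blockIdx.y * blockDim.y;
    if (row >= N || col >= N) return;  // skip threads outside the image
    int i = (row + col) % N;
    int j = (row + 2 * col) % N;
    // Assumption: read from (i, j); the original kernel leaves this blank.
    img_out[col + row * N] = img_in[j + i * N];
}
"""


def cat_map_cpu(img):
    """NumPy reference using the same index mapping as the kernel."""
    n = img.shape[0]
    rows, cols = np.indices((n, n))
    return img[(rows + cols) % n, (rows + 2 * cols) % n]


def cat_map_gpu(img):
    """Run the kernel on a square int32 image. Needs PyCUDA and a CUDA GPU."""
    import pycuda.autoinit  # noqa: F401, creates the CUDA context
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    img = np.ascontiguousarray(img, dtype=np.int32)
    n = img.shape[0]
    out = np.empty_like(img)

    # Only the arrays need device memory.
    img_gpu = cuda.mem_alloc(img.nbytes)
    out_gpu = cuda.mem_alloc(out.nbytes)
    cuda.memcpy_htod(img_gpu, img)

    func = SourceModule(KERNEL_SRC).get_function("ArCatMap")
    blocks = (n + 15) // 16
    # N goes in by value, wrapped in a NumPy scalar of the matching C type.
    func(img_gpu, out_gpu, np.int32(n), block=(16, 16, 1), grid=(blocks, blocks, 1))

    cuda.memcpy_dtoh(out, out_gpu)
    return out
```

On a machine with a GPU, something like np.array_equal(cat_map_gpu(img), cat_map_cpu(img)) should let you check the kernel against the NumPy version. Note that scalars passed to a PyCUDA kernel have to be NumPy scalars such as np.int32(n), not plain Python ints.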