I am currently using pycuda and scikits.cuda to solve linear equation A*x = b, where A is an upper/lower matrix. However the cublasStbsv routine requires a specific format. To give an example: if a lower matrix A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]], then the input required by cublasStbsv should be [[1, 3, 6], [2, 5, 0], [4, 0, 0]], where rows are diagonal, subdiagonal1, subdiagonal2, respectively. If using numpy, this can be easily done by stride_tricks.as_strided, but I dont know how to do similar things with pycuda.gpuarray. Any help would be appreciated, thanks. I found pycuda.compyte.array.as_strided, but it cannot be applied to gpuarray.
I got it done by using theano. First converted it to cudandarray, change stride and make a copy back to gpuarray. Just be careful about changes between Fortran and C order. update: finally got it done by using gpuarray.multi_take_put
def make_triangle(s_matrix, uplo = 'L'):
"""convert triangle matrix to the specific format
required by cublasStbsv, matrix should be in Fortran order,
s_matrix: gpuarray
"""
#make sure the dytpe is float32
if s_matrix.dtype != 'f':
s_matrix = s_matrix.astype('f')
dim = s_matrix.shape[0]
if uplo == 'L':
idx_tuple = np.tril_indices(dim)
gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
gdst = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * (dim - 1))
return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
else:
idx_tuple = np.triu_indices(dim)
gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
gdst = gpuarray.to_gpu(idx_tuple[0] + (idx_tuple[1] + 1) * (dim - 1))
return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]