Tags: cuda, pycuda, cublas, scikits

How to convert an upper/lower gpuarray to the specific format required by cublasStbsv?


I am currently using pycuda and scikits.cuda to solve the linear system A*x = b, where A is a lower/upper triangular matrix. However, the cublasStbsv routine requires a specific banded storage format. For example, if a lower triangular matrix is A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]], then the input required by cublasStbsv is [[1, 3, 6], [2, 5, 0], [4, 0, 0]], where the rows hold the diagonal, the first subdiagonal, and the second subdiagonal, respectively. With numpy this can easily be done with stride_tricks.as_strided, but I don't know how to do the same with a pycuda.gpuarray. Any help would be appreciated, thanks. I found pycuda.compyte.array.as_strided, but it cannot be applied to a gpuarray.
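
For reference, a minimal CPU-side numpy sketch of the layout cublasStbsv expects, using plain fancy indexing instead of as_strided; it reproduces the example above:

    import numpy as np

    A = np.asfortranarray([[1, 0, 0],
                           [2, 3, 0],
                           [4, 5, 6]], dtype=np.float32)
    dim = A.shape[0]
    rows, cols = np.tril_indices(dim)        # indices of the lower triangle
    band = np.zeros((dim, dim), dtype=A.dtype, order='F')
    band[rows - cols, cols] = A[rows, cols]  # subdiagonal k of A becomes band row k
    print(band)
    # [[1. 3. 6.]
    #  [2. 5. 0.]
    #  [4. 0. 0.]]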


Solution

  • I first got it done using theano: convert the gpuarray to a CudaNdarray, change the strides, and copy the result back to a gpuarray. Just be careful about the difference between Fortran and C order. Update: I finally got it done with gpuarray.multi_take_put (a usage sketch follows the function below):

    import numpy as np
    import pycuda.gpuarray as gpuarray

    def make_triangle(s_matrix, uplo='L'):
        """Convert a triangular matrix to the banded format required
        by cublasStbsv; the matrix must be in Fortran order.
        s_matrix: gpuarray
        """
        # make sure the dtype is float32
        if s_matrix.dtype != 'f':
            s_matrix = s_matrix.astype('f')
        dim = s_matrix.shape[0]
        if uplo == 'L':
            # lower triangle: element (i, j) goes to band row i - j, column j,
            # i.e. Fortran flat index i + j * (dim - 1)
            idx_tuple = np.tril_indices(dim)
            gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
            gdst = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * (dim - 1))
            return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
        else:
            # upper triangle: element (i, j) goes to band row (dim - 1) + i - j,
            # column j, i.e. Fortran flat index i + (j + 1) * (dim - 1)
            idx_tuple = np.triu_indices(dim)
            gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
            gdst = gpuarray.to_gpu(idx_tuple[0] + (idx_tuple[1] + 1) * (dim - 1))
            return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
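
    A minimal end-to-end sketch of feeding the result to the solver, assuming scikits.cuda exposes cublasStbsv with the CUBLAS C signature (handle, uplo, trans, diag, n, k, A, lda, x, incx) and that to_gpu preserves the Fortran layout of A:

    import numpy as np
    import pycuda.autoinit
    import pycuda.gpuarray as gpuarray
    from scikits.cuda import cublas

    A = np.asfortranarray([[1, 0, 0],
                           [2, 3, 0],
                           [4, 5, 6]], dtype=np.float32)
    b = np.array([1, 8, 32], dtype=np.float32)

    g_band = make_triangle(gpuarray.to_gpu(A), uplo='L')
    g_x = gpuarray.to_gpu(b)          # tbsv overwrites the right-hand side with x
    dim = A.shape[0]

    handle = cublas.cublasCreate()
    # solve A * x = b with the full-bandwidth banded matrix (k = dim - 1, lda = dim)
    cublas.cublasStbsv(handle, 'L', 'N', 'N', dim, dim - 1,
                       g_band.gpudata, dim, g_x.gpudata, 1)
    cublas.cublasDestroy(handle)
    print(g_x.get())                  # expected: [1. 2. 3.]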