What is the API for writing a custom (Elementwise or Raw) kernel that works on cuSPARSE instances in CuPy? If I want to write a kernel that can take a cupyx.scipy.sparse.csr_matrix instance as an argument, what arguments does the underlying CUDA code need to take?
Found it (at least approximately). For a CSR sparse matrix s, the fields s.data, s.indices, and s.indptr contain, respectively, a dense array of length s.nnz holding the nonzero entries, a dense array of length s.nnz holding their column indices, and an array of length s.shape[0] + 1 giving the offset into indices (and data) at which each row begins. These are all normal cupy.ndarray objects, so they can be passed into a kernel, or on to cuSPARSE functions, as usual.