The cuSolverRf sample runs smoothly with the bundled .mtx files lap2D_5pt_n100.mtx and lap3D_7pt_n20.mtx. However, when I substitute my own .mtx file, I get an error after step 8:
"CUDA error at cuSolverRF.ccp:649 code=2..."
I've narrowed down the problem to here:
    checkCudaErrors(cusolverRfSetupHost(
        rowsA, nnzA,
        h_csrRowPtrA, h_csrColIndA, h_csrValA,
        nnzL,
        h_csrRowPtrL, h_csrColIndL, h_csrValL,
        nnzU,
        h_csrRowPtrU, h_csrColIndU, h_csrValU,
        h_P,
        h_Q,
        cusolverRfH));
which then jumps to:
    template <typename T>
    void check(T result, char const *const func, const char *const file, int const line)
    {
        // A nonzero status means the wrapped call failed
        if (result)
        {
            fprintf(stderr, "CUDA error at %s:%d code=%d(%s) \"%s\" \n",
                    file, line, static_cast<unsigned int>(result), _cudaGetErrorEnum(result), func);
            DEVICE_RESET
            // Make sure we call CUDA Device Reset before exiting
            exit(EXIT_FAILURE);
        }
    }
My question is: how is "result" derived? And what can I do to overcome the problem, or what am I doing wrong?
Additional info: my matrix is 196530 by 196530 with 2530798 nnz.
The "result" that check() receives is the status value returned by cusolverRfSetupHost. Error code 2 corresponds to CUSOLVER_STATUS_ALLOC_FAILED. Quoting the cuSOLVER documentation:
Resource allocation failed inside the cuSolver library. This is usually caused by a cudaMalloc() failure. To correct: prior to the function call, deallocate previously allocated memory as much as possible.
This means cuSOLVER could not allocate the memory it needs for your matrix, most likely because the GPU ran out of memory. As the documentation suggests, free any device memory you no longer need before the call. If that is not enough, use a smaller input matrix or a GPU with more memory.
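To get a feel for whether the matrices can fit at all, a rough size estimate may help. The sketch below is not part of the sample: csrBytes, rfLowerBoundBytes, and reportRfFootprint are hypothetical helper names, and it assumes 32-bit int indices, double values, and that the library keeps device copies of A, L, U, P, and Q. Internal workspace is not counted, so treat the result as a lower bound only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Bytes for one CSR matrix with `rows` rows and `nnz` nonzeros.
// Assumption: int row pointers / column indices and double values.
std::size_t csrBytes(int rows, int nnz)
{
    assert(rows >= 0 && nnz >= 0);
    std::size_t rowPtr = (static_cast<std::size_t>(rows) + 1) * sizeof(int);
    std::size_t colInd = static_cast<std::size_t>(nnz) * sizeof(int);
    std::size_t values = static_cast<std::size_t>(nnz) * sizeof(double);
    return rowPtr + colInd + values;
}

// Lower bound on the data handed to cusolverRfSetupHost:
// A, L, and U in CSR form plus the permutation vectors P and Q.
// Assumption: any internal workspace comes on top of this.
std::size_t rfLowerBoundBytes(int rows, int nnzA, int nnzL, int nnzU)
{
    std::size_t perms = 2 * static_cast<std::size_t>(rows) * sizeof(int);
    return csrBytes(rows, nnzA) + csrBytes(rows, nnzL) +
           csrBytes(rows, nnzU) + perms;
}

// Print the estimate in MiB so it can be compared with free GPU memory.
void reportRfFootprint(int rows, int nnzA, int nnzL, int nnzU)
{
    const double MiB = 1024.0 * 1024.0;
    std::printf("A: %.1f MiB, L: %.1f MiB, U: %.1f MiB, total >= %.1f MiB\n",
                csrBytes(rows, nnzA) / MiB,
                csrBytes(rows, nnzL) / MiB,
                csrBytes(rows, nnzU) / MiB,
                rfLowerBoundBytes(rows, nnzA, nnzL, nnzU) / MiB);
}
```

Calling reportRfFootprint(rowsA, nnzA, nnzL, nnzU) just before cusolverRfSetupHost, and comparing the total with the free memory reported by cudaMemGetInfo, should show whether the factors are simply too large. Under these assumptions A itself is small for the matrix in the question (about 30 MiB), so the fill-in in L and U (nnzL and nnzU) is the number worth printing.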