
Should we reuse the cublasHandle_t across different calls?


I'm using CUDA 5.5, currently the latest version, and the new cuBLAS API is stateful: every function takes a cublasHandle_t, e.g.

  cublasHandle_t handle;
  cublasCreate_v2(&handle);
  cublasDgemm_v2(handle, A_trans, B_trans, m, n, k, &alpha, d_A, lda, d_B, ldb, &beta, d_C, ldc);
  cublasDestroy_v2(handle);

Is it good practice to reuse this handle as much as possible, like a kind of session object? Or is the performance impact so small that it makes more sense to reduce code complexity by using short-lived handles, creating and destroying them continuously?


Solution

  • I think reusing the handle is good practice, for two reasons:

    1. According to the cuBLAS Library User Guide, "cublasCreate() [...] allocates hardware resources on the host and device", which suggests that the call carries some overhead.
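    2. Repeatedly creating and destroying cuBLAS handles can break concurrency by triggering unneeded context synchronizations; the cuBLAS documentation notes that cublasDestroy() implicitly calls cudaDeviceSynchronize().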