cublasSgetriBatched compilation error with CUDA 7.0 Release Candidate

Consider the code posted by sgarizvi at

CUBLAS: Incorrect inversion for matrix with zero pivot

I'm using that code as an off-the-shelf reproducer of my problem.

If I compile it with CUDA 6.0, everything works fine. However, if I compile it with CUDA 6.5 or the CUDA 7.0 Release Candidate, I get:

Error   13  error C2664: 'cublasSgetriBatched' : cannot convert parameter 3 from 'float **' to 'const float *[]'    C:\Users\user\Documents\Project\StackOverflow15\StackOverflow15\kernel.cu   70  1   StackOverflow15

Is this a bug, or am I doing something wrong?

My configuration: Windows 7, Microsoft Visual Studio 2010, Release Mode, x64, compute_20,sm_21.

EDIT

Following Robert Crovella's answer and Park Young-Bae's comment, the linked example can be made to compile with CUDA 6.5 or 7.0 by changing the line

cublascall(cublasSgetriBatched(handle,n,A_d,lda,P,C_d,lda,INFO,batchSize));

to

cublascall(cublasSgetriBatched(handle,n,(const float **)A_d,lda,P,C_d,lda,INFO,batchSize));

Solution

I haven't tried this on Windows, but on Linux I observe the same compile error with both CUDA 6.5 and CUDA 7 RC. If I go back to CUDA 6.0 (which the linked previous question primarily has in view), the compile error goes away.

There was a change in the CUBLAS API in this respect, in particular for the getriBatched function prototype in cublas_api.h:

CUDA 6.0:

/* Batched inversion based on LU factorization from getrf */
CUBLASAPI cublasStatus_t CUBLASWINAPI cublasSgetriBatched(cublasHandle_t handle,
                                                  int n,
                                                  float *A[],                     /*Device pointer*/
                                                  int lda,
                                                  int *P,                         /*Device pointer*/
                                                  float *C[],                     /*Device pointer*/
                                                  int ldc,
                                                  int *INFO,
                                                  int batchSize);

CUDA 6.5/7RC:

/* Batched inversion based on LU factorization from getrf */
CUBLASAPI cublasStatus_t CUBLASWINAPI cublasSgetriBatched(cublasHandle_t handle,
                                                  int n,
                                                  const float *A[],               /*Device pointer*/
                                                  int lda,
                                                  const int *P,                   /*Device pointer*/
                                                  float *C[],                     /*Device pointer*/
                                                  int ldc,
                                                  int *info,
                                                  int batchSize);

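Note the addition of the const qualifier on the 3rd parameter, A. This is fundamentally what gives rise to the observed difference. (The 5th parameter, P, also gained const, but int * converts implicitly to const int *, so it causes no error.) As for the error itself, the compiler is right to report it: C++ does not implicitly convert float ** to const float **, as indicated by @ParkYoungBae in the comments.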
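To illustrate why C++ rejects this conversion, here is a minimal host-only sketch. It does not use CUBLAS at all; read_first and demo_const_rule are hypothetical names made up for the illustration. The commented-out lines show what an implicit float ** to const float ** conversion would permit if it were allowed.

```cpp
#include <cassert>

// Hypothetical helper: const is added at every pointer level, so a plain
// float ** argument converts to this parameter type implicitly.
float read_first(const float *const *rows) { return rows[0][0]; }

// Hypothetical demo function, not part of any library.
float demo_const_rule()
{
    float value = 3.0f;
    float *p = &value;
    float **pp = &p;

    // const float **bad = pp;   // rejected by the compiler; if it were allowed:
    // const float locked = 1.0f;
    // *bad = &locked;           // p would now point at a const object
    // *p = 2.0f;                // and this would write to that const object

    const float *const *ok = pp;                // implicit conversion is fine
    const float **forced = (const float **)pp;  // needs an explicit cast
    assert(ok[0] == forced[0]);                 // both views alias the same data

    return read_first(pp) + **forced;
}
```

The cast is safe here because nothing redirects a pointer through forced; the same reasoning applies when the callee only reads through the pointer array.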

The original code in the previous linked question should be modified for use with the newer CUBLAS API headers, for example by casting the third argument to (const float **) as in the question's edit.
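As a rough sketch of that modification, the snippet below applies the same cast to a host-side stand-in. sum_first_elements is a hypothetical function that only mimics the shape of the CUDA 6.5+ third parameter (const float *A[]); it is not the CUBLAS routine, and the arrays here live in host memory rather than on the device.

```cpp
#include <cassert>

// Hypothetical stand-in with the same third-parameter shape as the
// CUDA 6.5+ prototype. It only reads through A.
float sum_first_elements(const float *A[], int batchSize)
{
    assert(A != nullptr && batchSize >= 0);
    float sum = 0.0f;
    for (int i = 0; i < batchSize; ++i)
        sum += A[i][0];   // first element of each matrix in the batch
    return sum;
}

// Hypothetical demo function showing the call site before and after the fix.
float demo_cast()
{
    float m0[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float m1[4] = {5.0f, 6.0f, 7.0f, 8.0f};
    float *batch[2] = {m0, m1};
    float **A = batch;    // same type as A_d in the linked code

    // return sum_first_elements(A, 2);             // does not compile
    return sum_first_elements((const float **)A, 2); // compiles with the cast
}
```

Keeping the array declared as float ** and casting at the call site means the same array can still be passed to functions that expect float *A[].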