Macro parameter won't take the argument passed (nvcc)

I'm just starting out with CUDA and I'm trying to split my code across several files, but one of my macros won't accept the argument I pass to it.

The error is:

addkernel.cu(19): error: identifier "err" is undefined

My main code is in ../cbe4/addkernel.cu:

#include <stdio.h>
#include <stdlib.h>

#include "cbe4.h"
#include "../mycommon/general.h"

#define N 100

int main( int argc, char ** argv ){

        float h_out[N], h_a[N], h_b[N]; 
        float *d_out, *d_a, *d_b; 

        for (int i=0; i<N; i++) {
                h_a[i] = i + 5;
                h_b[i] = i - 10;
        }

        // The error is on the next line
        CUDA_ERROR( cudaMalloc( (void **) &d_out, sizeof(float) * N ) );
        CUDA_ERROR( cudaMalloc( (void **) &d_a, sizeof(float) * N ) ); 
        CUDA_ERROR( cudaMalloc( (void **) &d_b, sizeof(float) * N ) );

        cudaFree(d_a);
        cudaFree(d_b);


        return EXIT_SUCCESS;
}

The macro is defined in ../mycommon/general.h:

#ifndef __GENERAL_H__
#define __GENERAL_H__

#include <stdio.h>

// error checking 
void CudaErrorCheck (cudaError_t err, const char *file, int line);

#define CUDA_ERROR ( err ) (CudaErrorCheck( err, __FILE__, __LINE__ )) 

#endif

This is the source code for the function CudaErrorCheck, in ../mycommon/general.cu:

#include <stdio.h>
#include <stdlib.h>

#include "general.h"

void CudaErrorCheck (cudaError_t err,
                        const char *file,
                        int line) {
        if ( err != cudaSuccess ) {
                printf( "%s in %s at line %d \n",
                        cudaGetErrorString( err ),
                        file, line );
                exit( EXIT_FAILURE );
        }
}

../cbe/cbe4.h is my header file and ../cbe/cbe4.cu is the source file for the kernel code (in case this helps):

in cbe4.h:

__global__
void add( float *, float *, float * );

in cbe4.cu:

#include "cbe4.h"

__global__ void add( float *d_out, float *d_a, float *d_b ) {
        int tid = (blockIdx.x * blockDim.x) + threadIdx.x;
        d_out[tid] = d_a[tid] + d_b[tid];
}

and here's my makefile (stored in ../cbe4):

NVCC = nvcc
SRCS = addkernel.cu cbe4.cu
HSCS = ../mycommon/general.cu

addkernel: $(SRCS) $(HSCS)
	$(NVCC) $(SRCS) $(HSCS) -o $@

Also, I'm using the CUDA by Example book. In its common/book.h, the HandleError function (which I renamed CudaErrorCheck and moved into a separate source file here) is defined in the header file itself, at the point where my general.h only has the CudaErrorCheck declaration. Isn't defining a function in a header inadvisable? Or so I've heard.

Solution

Spacing matters in macro definitions. You have:

#define CUDA_ERROR ( err ) (CudaErrorCheck( err, __FILE__, __LINE__ ))

You need (minimal change: delete one space):

#define CUDA_ERROR( err ) (CudaErrorCheck( err, __FILE__, __LINE__ ))

In the definition of a function-like macro, there must be no white space between the macro name and the opening parenthesis of the parameter list. If there is, the preprocessor treats the name as an object-like macro whose replacement text starts with that parenthesis. When you use the macro, on the other hand, white space is allowed between the macro name and the opening parenthesis of the argument list.
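To make the spacing rule concrete, here is a minimal sketch in plain C so it builds without the CUDA toolkit. The names check_status, CHECK, last_line and spacing_demo are hypothetical stand-ins (check_status plays the role of CudaErrorCheck); they are not from the original code.

```c
#include <assert.h>

/* Hypothetical stand-in for CudaErrorCheck: records the calling line and
   returns the status unchanged. */
static int last_line = 0;

static int check_status(int status, const char *file, int line)
{
    (void)file;          /* unused in this sketch */
    last_line = line;
    return status;
}

/* No white space between CHECK and '(' in the definition, so this is a
   function-like macro with one parameter named err. */
#define CHECK(err) check_status((err), __FILE__, __LINE__)

/* Returns 1 when both call styles reach check_status with the argument. */
static int spacing_demo(void)
{
    int a = CHECK(7);    /* usual call style */
    int b = CHECK (7);   /* white space at the point of use is allowed */
    assert(last_line > 0);
    return a == 7 && b == 7;
}
```

Calling spacing_demo() from main should return 1; adding a space after CHECK in the #define line instead would reproduce the "identifier is undefined" kind of error from the question.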

I'd write:

#define CUDA_ERROR(err) CudaErrorCheck(err, __FILE__, __LINE__)

The extra parentheses around the whole expansion aren't really necessary, and I don't much like white space around parentheses. Different people have different views on this, so I'm stating my preference without in any sense demanding that you use it (but obviously suggesting that you consider it).

Because of the space, CUDA_ERROR was defined as an object-like macro, so your code expanded to:

( err ) (CudaErrorCheck( err, "addkernel.cu", 19 ))( cudaMalloc( (void **) &d_out, sizeof(float) * N ) );

and since err is not declared anywhere in main, it was diagnosed as an undefined identifier, making the apparent cast ( err ) invalid.
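If you want to see for yourself what a macro name turns into, one option is the usual two-level stringification trick, sketched below in plain C. STR, XSTR, BROKEN, FIXED, broken_text and fixed_is_unexpanded are hypothetical names for illustration, and check is never called; it only appears inside the text being stringified.

```c
#include <string.h>

/* Two-level stringification: XSTR expands its argument, then STR quotes it. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Space before '(' : an object-like macro, so the bare name BROKEN is
   replaced by everything after it, starting with "( err )". */
#define BROKEN ( err ) (check( err ))

/* No space before '(' : a function-like macro, so the bare name FIXED
   (not followed by an argument list) is left alone. */
#define FIXED( err ) (check( err ))

/* Text that the bare name BROKEN expands to. */
static const char *broken_text(void)
{
    return XSTR(BROKEN);
}

/* Returns 1 if the bare name FIXED was not expanded. */
static int fixed_is_unexpanded(void)
{
    return strcmp(XSTR(FIXED), "FIXED") == 0;
}
```

Printing broken_text() shows replacement text that begins with a parenthesis and contains err, which is the same shape as the expansion above. Running nvcc with its -E option to view the preprocessed source is another way to check.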