cuda global pointer allocation in different source file

I faced a situation that I need some tables to be filled in one source file (for example fill.cu) and then be used in different kernels in different source files.

I tried declaring a pointer __device__ float *myTable; as 'extern' in fill.h header file and adding that to others.cpp and defining that pointer in fill.cu and allocate and fill it there.
This way, I got linker error indicating that myTable has been already defined in fill.cpp.

After many unsuccessful try, I decided to put all kernels that need this table in same source file, this way everything works fine until I added an cudaMalloc in main function before allocating my table in fill.cpp.
This way I noticed that table values and data allocated in main are overlapped and using cuda debugging tools of MS visual studio 2015, I found that 2 allocated pointer are same!!!

Please advice how to declare a global pointer in cuda without conflict.

Solution

The traditional CUDA linkage model requires that all device symbols, textures, functions, etc. are defined and used within the scope of the same translation unit. It sounds like your code structure is violating this requirement.

You have two choices:

Continue to the same code structure, but provide wrapper functions which your main can call to perform operations on statically declared device variables, rather than directly manipulating device symbols with the CUDA API from other code.
Use separate compilation. Here, you define the device symbol you want to access in exactly one file and declare the same symbol as externeverywhere else you need to use that symbol. You must explicitly use several nvcc options to compile your device code and use a separate device code linking stage.

Both approaches are well documented.