c gcc cuda compiler-optimization nvcc

CUDA - nvcc -G - if not working properly


I'm currently porting a lava flow model to CUDA (the full code is on GitHub here: Full source of the CUDA-SCIARA Fv2 lava flow model).

EDIT

To reproduce the issue, follow the instructions in the GitHub README to get a dev copy of the project SCIARA_FV2_CUDA_MULTICELLS. Then compile without the -G option and pass -c ../data/2006/PARAMETERS.cfg as the command-line argument. (The quoted code is at line 260 in this file.)

I'm facing a problem with an if construct inside a __device__ function.

Based on the current lava quantity and temperature, it computes the new temperature, and if that is at or below a constant parameter (the variable d_PTsol = 1143.0), the lava is solidified.

The problem is that the code below works perfectly if I compile with the -G option (which generates device-code debug information) but behaves wrongly without it.

double new_temp = d_computeNewTemperature(sommah,sommath);
if(new_temp <= d_PTsol){
    printf("Solidified %.5f,%.5f\n",new_temp,d_PTsol);
    double newQuote = d_sbts_updated[d_getIdx(row,col,ALTITUDE)]+d_sbts_current[d_getIdx(row,col,THICKNESS)];
    //CODE FOR LAVA SOLIDIFICATION HERE
}else{
    //there is lava and is not solidified -> activate this cell!
    adjustAdaptiveGrid(row,col);
}

Without -G, it outputs something like this at a certain point of the simulation:

Solidified 1344.68654 1143.00000
Solidified 1343.99509 1143.00000
Solidified 1320.50061 1143.00000
Solidified 1325.53942 1143.00000

To make things more subtle, the problem completely disappears if I change the if condition to a strict inequality, if(new_temp < d_PTsol).

Compilation is carried out in separate compilation mode with the following options:

-O3 -Xcompiler -fPIC -std=c++11

and linking using

--cudart static --relocatable-device-code=true -gencode arch=compute_35,code=compute_35 -gencode arch=compute_35,code=sm_35

Has anyone faced a similar issue before? Am I doing something wrong?

UPDATE

The problem seems to be somehow related to how the if/else construct with <= as its condition is translated. Rewriting

if(new_temp <= d_PTsol) {
        //solidification
}else{
        //something else
}

to

if(new_temp <= d_PTsol) {
        //solidification
}
if(!(new_temp <= d_PTsol)){
        //something else
}

makes the code work perfectly.


Solution

  • I had a similar issue: my code worked with the -G option but not without it. I just needed to add -fmad=false and -prec-div=false to the compiler options (for more information see: http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#options-for-steering-gpu-code-generation).

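    Both options default to true. With -fmad=true the compiler contracts floating-point multiplies and adds into fused multiply-add instructions, which is faster but rounds differently from separate operations, so results can change in the last bits. If you enable debug mode, this optimisation is disabled. I needed exact values, so my code worked only with -G. Maybe you have a similar issue. Note that -prec-div affects only single-precision division and reciprocals, and that true is its IEEE-compliant setting. (More information about CUDA floating-point precision: http://docs.nvidia.com/cuda/cuda-c-programming-guide/#mathematical-functions-appendix)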
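To illustrate why multiply-add contraction can change a result, here is a minimal host-side C++ sketch that uses std::fma to stand in for a fused instruction. It is not CUDA device code. The helper names and the sample inputs are assumptions, chosen so that the exact product needs more bits than a double can hold.

```cpp
#include <cmath>

// Hypothetical helpers for illustration; not taken from the original project.

// Multiply, round to double, then add: two rounding steps.
// The volatile store keeps the host compiler from fusing the operations itself.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c is rounded only once, at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

struct Sample {
    double a, b, c;
};

// With eps = 2^-30, the exact product (1 + eps)^2 = 1 + 2^-29 + 2^-60.
// The 2^-60 term is lost when the product is rounded to a double.
Sample contraction_sample() {
    const double eps = std::ldexp(1.0, -30);
    return {1.0 + eps, 1.0 + eps, -(1.0 + 2.0 * eps)};
}
```

With these inputs, mul_then_add returns exactly 0.0, while fused_mul_add returns 2^-60, because the fused form keeps the low-order term of the product. A comparison such as result <= 0.0 would therefore go different ways in the two builds, but only when the operands differ in their last bits.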