lua dockerfile cudnn

Error when running torch prediction model on GPU


I have been trying to use a pretrained machine learning model for captioning pictures: https://github.com/unnonouno/densecap

It comes with a Dockerfile that sets up a complete CUDA/Torch/cuDNN environment.

Predictions on a new picture are made by running the run_model.lua script. It works on the CPU when I pass -gpu -1, but not when I remove that argument and run it on the GPU. In that case I get the following error:

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-8398/cutorch/lib/THC/THCGeneral.c line=70 error=35 : CUDA driver version is insufficient for CUDA runtime version
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/trepl/init.lua:389: loop or previous error loading module 'cutorch'
stack traceback:
    [C]: in function 'error'
    /root/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    ./densecap/utils.lua:26: in function 'setup_gpus'
    run_model.lua:145: in main chunk
    [C]: in function 'dofile'
    /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00406670

I have tried different things, such as reinstalling cuDNN by running luarocks install cudnn and downgrading from cudnn5 to cudnn4, without any success.


Solution

  • The issue appears to be with your CUDA driver:

    CUDA driver version is insufficient for CUDA runtime version

    Take a look at similar discussions here.

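    There is no need to change your cuDNN version. The message means that the NVIDIA driver visible to the process supports an older CUDA version than the CUDA runtime that cutorch was built against. Because the environment runs in Docker, the driver in question is the one installed on the host, not anything inside the image. Either update the host driver or use a CUDA toolkit version that the installed driver supports.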
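To make the driver/runtime comparison concrete, here is a minimal sketch in Python of the check that fails. It assumes you have already obtained the two integers that the CUDA runtime API reports through cudaDriverGetVersion and cudaRuntimeGetVersion, which encode a version as 1000*major + 10*minor and report a driver version of 0 when no driver is installed. The function names and the sample values (7050 and 8000) are hypothetical and chosen only for illustration; they are not taken from the question.

```python
def decode_cuda_version(encoded):
    """Split CUDA's integer encoding (1000*major + 10*minor) into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


def driver_supports_runtime(driver_version, runtime_version):
    """True when the CUDA version supported by the driver is at least the runtime's."""
    return driver_version >= runtime_version


def describe(driver_version, runtime_version):
    """Return a short human-readable diagnosis for a driver/runtime pair."""
    if driver_version == 0:
        # A driver version of 0 means no driver is installed or visible.
        return "no CUDA driver visible to this process"
    d_major, d_minor = decode_cuda_version(driver_version)
    r_major, r_minor = decode_cuda_version(runtime_version)
    if driver_supports_runtime(driver_version, runtime_version):
        verdict = "driver is sufficient"
    else:
        verdict = "driver is insufficient for this runtime"
    return "driver supports CUDA %d.%d, runtime is CUDA %d.%d: %s" % (
        d_major, d_minor, r_major, r_minor, verdict)


if __name__ == "__main__":
    # Hypothetical values: a driver that supports up to CUDA 7.5
    # paired with a CUDA 8.0 runtime inside the container.
    print(describe(7050, 8000))
```

If the driver side comes out lower than the runtime side, as in the hypothetical pair above, reinstalling or downgrading cuDNN cannot help, because cuDNN sits on top of the same runtime.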