opencv ubuntu cuda nvcc

Installing OpenCV 3.1 with CUDA 7.5 on Ubuntu 16.04 fails with an odd error


I've just done a clean install of Ubuntu 16.04 with CUDA 7.5 and had problems installing OpenCV 3.1.

When I run make, I get the following error:

[ 9%] Building NVCC (Device) object modules/core/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_gpu_mat.cu.o
/usr/include/string.h: In function ‘void* __mempcpy_inline(void*, const void*, size_t)’:
/usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
return (char *) memcpy (__dest, __src, __n) + __n;

I found a fix in several (closed) GitHub issue threads:

 In opencv/cmake/OpenCVDetectCUDA.cmake, change

    set(NVCC_FLAGS_EXTRA ${NVCC_FLAGS_EXTRA} -gencode arch=compute_${CMAKE_MATCH_2},code=sm_${CMAKE_MATCH_1})

 to

    set(NVCC_FLAGS_EXTRA ${NVCC_FLAGS_EXTRA} -D_FORCE_INLINES -gencode arch=compute_${CMAKE_MATCH_2},code=sm_${CMAKE_MATCH_1})

This solution worked for me, but I still don't understand the original problem or why the fix works. Why does adding the flag -D_FORCE_INLINES fix things? Why is there a problem with string.h, which is (I think) one of the more stable files being compiled? I would have expected any errors to come from CUDA 7.5 or OpenCV 3.1.

If I see this issue again, how do I recognize it?


Solution

  • Apparently, /usr/include/string.h changed between glibc 2.22 and glibc 2.23 (https://fossies.org/diffs/glibc/2.22_vs_2.23/string/string.h-diff.html). The added code is at the bottom of the file:

    #if defined __USE_GNU && defined __OPTIMIZE__ \
            && defined __extern_always_inline && __GNUC_PREREQ (3,2)
        # if !defined _FORCE_INLINES && !defined _HAVE_STRING_ARCH_mempcpy
    
        #undef mempcpy
        #undef __mempcpy
        #define mempcpy(dest, src, n) __mempcpy_inline (dest, src, n)
        #define __mempcpy(dest, src, n) __mempcpy_inline (dest, src, n)
    
        __extern_always_inline void *
        __mempcpy_inline (void *__restrict __dest,
                         const void *__restrict __src, size_t __n)
        {
          return (char *) memcpy (__dest, __src, __n) + __n;
        }
    
        # endif
        #endif
    

    The ways I've seen to stop this new code from triggering the memcpy error are:

    1 Just comment out this code

    2 Add -D_FORCE_INLINES to the flags for NVCC

    (https://github.com/opencv/opencv/issues/6500)
       In opencv/cmake/OpenCVDetectCUDA.cmake, replace
    
       set(NVCC_FLAGS_EXTRA ${NVCC_FLAGS_EXTRA} -gencode arch=compute_${CMAKE_MATCH_2},code=sm_${CMAKE_MATCH_1})
    
    to
    
       set(NVCC_FLAGS_EXTRA ${NVCC_FLAGS_EXTRA} -D_FORCE_INLINES -gencode arch=compute_${CMAKE_MATCH_2},code=sm_${CMAKE_MATCH_1})
    

    or, for similar errors, add -D_FORCE_INLINES to the ccflags for cc (but I can't find the reference now)

    Now, I'm trying to figure out what this code does....