
Segmentation fault when reading variables in pointed function


I have the following pieces of code. (Note: the code has been simplified and changed a lot, as I can't share the data.)

exercise.cpp

#include "some_api.h"

extern "C" int extension_init(func_t*, VExtension*);

int main(int argc, char** argv)
{
    VExtension ve;
    extension_init(&func, &ve);
    return 0;
} 

some_api.h

bool func(int const& a, void* const& b, VExtension* const& v)
{
    std::cout << a << b << std::endl;
}

api.h

typedef int (func_t)(int c, void* p, VExtension* v);

file.cpp

#include "api.h" // this is included implicitly

extern "C" int extension_init(func_t* F, VExtension* v)
{
    intptr_t ver = 7;
    F(1, (void*)ver, v);
}

So when F is called, func from some_api.h is called, but a segmentation fault occurs when it tries to output the values a and b. AddressSanitizer gives the following error message:

ASAN:DEADLYSIGNAL
=================================================================
==15==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x55d6dbe48ca2 bp 0x7fff79dbf320 sp 0x7fff79dbf300 T0)
==15==The signal is caused by a READ memory access.
==15==Hint: address points to the zero page.
    #0 0x55d6dbe48ca1 in func(int const&, void* const&, VExtension* const&) some_api.h:279
    #1 0x55d6dbefe697 in file.cpp:809
    #2 0x55d6dbe5373a in main exercise.cpp:123
    #3 0x7f9c65393b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #4 0x55d6dbd49839 in _start (//...)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: some_api.h:279 in func(int const&, void* const&, VExtension* const&)
==15==ABORTING

Solution

  • Corrected code could work

    If func_t were defined with the same signature as func(), everything would be fine, since the arbitrary value of b is never dereferenced:

    typedef bool (func_t)(int const& a, void* const& b, VExtension* const& v);
    

    See demo.
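
A minimal single-file sketch of the corrected version, for illustration only. The empty `VExtension` struct is a placeholder assumption, since the real type was not shown, and `func` and `extension_init` are given the `return` statements that the simplified snippets lack.

```cpp
#include <cstdint>
#include <iostream>

// Placeholder: the real VExtension was not shown in the question.
struct VExtension {};

// func_t now has exactly the same signature as func().
typedef bool (func_t)(int const& a, void* const& b, VExtension* const& v);

bool func(int const& a, void* const& b, VExtension* const& v)
{
    (void)v;  // unused in this sketch
    // b is only printed as an address, never dereferenced.
    std::cout << a << ' ' << b << std::endl;
    return true;
}

extern "C" int extension_init(func_t* F, VExtension* v)
{
    std::intptr_t ver = 7;
    // The const references bind to temporaries that live for the whole call.
    return F(1, reinterpret_cast<void*>(ver), v) ? 0 : 1;
}
```

Calling `extension_init(&func, &ve)` with a local `VExtension ve;` then goes through matching caller and callee signatures, so the values arrive intact.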

    But your code can't, so there's something fishy going on here

    Your code seems to compile, at least judging from the runtime trace you've displayed. But according to the snippets you've posted, the code should not compile at all. So I suspect that your different headers managed to create a dangerous mix, with different definitions of func_t in different compilation units. If this is the case, you would be in the world of undefined behavior.

    For instance, the return type int doesn't match the return type bool. Far more serious, the parameters of func_t are plain values, whereas those of func() are all references. So if the compiler was tricked, the caller would use a calling convention that is incompatible with the one func() expects, causing invalid memory accesses. Most probably (though this is implementation dependent), the generated calling code passes the arguments by value, while the generated receiving code expects each parameter to be a pointer to a value and tries to access the pointed-to memory, which is invalid. That would be consistent with the report above: the faulting address 0x000000000001 equals the value 1 passed as the first argument.
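
The mismatch between the two signatures can be checked at compile time. The sketch below is only an illustration: `VExtension` is an empty placeholder, and the typedef names `func_by_value_t` and `func_by_ref_t` are hypothetical names introduced here to keep the two variants of `func_t` apart.

```cpp
#include <type_traits>

// Placeholder: the real VExtension was not shown in the question.
struct VExtension {};

// Signature as declared in api.h: value parameters, int return.
typedef int (func_by_value_t)(int c, void* p, VExtension* v);

// Signature that func() actually has: reference parameters, bool return.
typedef bool (func_by_ref_t)(int const& a, void* const& b, VExtension* const& v);

bool func(int const& a, void* const& b, VExtension* const& v)
{
    (void)a; (void)b; (void)v;  // unused in this sketch
    return true;
}

// &func has exactly the reference-taking pointer type...
static_assert(std::is_same<decltype(&func), func_by_ref_t*>::value,
              "func matches the by-reference signature");

// ...and does not convert implicitly to the by-value pointer type,
// so passing &func where func_by_value_t* is expected is ill-formed.
static_assert(!std::is_convertible<decltype(&func), func_by_value_t*>::value,
              "func must not convert to the by-value signature");
```

A `reinterpret_cast` between the two pointer types would compile, but calling `func` through the mismatched type is undefined behavior.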

    Additional remark

    Now, I'm not a language lawyer, but it's also interesting to see a function declared with C language linkage (extern "C") combined with parameter and return types that do not exist in C (i.e. bool and references).
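
For comparison, here is a sketch of a callback interface restricted to types that a C translation unit could also declare. The names `func_c_t`, `func_c` and `extension_init_c` are hypothetical, and `VExtension` is again an empty placeholder; this is one possible shape, not the original API.

```cpp
#include <cstdint>
#include <iostream>

// Placeholder: the real VExtension was not shown in the question.
struct VExtension {};

extern "C" {

// Plain values and pointers only, with an int return.
typedef int (func_c_t)(int a, void* b, VExtension* v);

int func_c(int a, void* b, VExtension* v)
{
    (void)v;  // unused in this sketch
    // b is only printed as an address, never dereferenced.
    std::cout << a << ' ' << b << std::endl;
    return 1;
}

int extension_init_c(func_c_t* F, VExtension* v)
{
    std::intptr_t ver = 7;
    return F(1, reinterpret_cast<void*>(ver), v);
}

}  // extern "C"
```

With this shape, the declaration a C caller would see and the C++ definition agree on how every argument is passed.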

    Additional insights

    Looking at the assembly code generated for extension_init on one compiler using the corrected definition of func_t:

    lea rdx, [rsp+8]           ; load effective address of third param value
    lea rsi, [rsp+24]          ; load effective address of second param value
    mov QWORD PTR [rsp+24], 7  ; initialize value
    lea rdi, [rsp+20]          ; load effective address of first param value
    mov DWORD PTR [rsp+20], 1  ; initialize value
    call rax
    

    With your original definition, the generated code looks different:

                    ; the third argument is not set in this excerpt:
                    ; it was set earlier with an LEA that loads the effective
                    ; address of the struct.
    mov esi, 7      ; pass the second value
    mov edi, 1      ; pass the first value
    call rax