c++ c linux dlopen dlsym

Calling dlsym() with a NULL handle doesn't return NULL, but rather returns a random function


My title may not be clear, so allow me to explain. I have a piece of code that goes like this:

void* pluginFile = dlopen(fileName, RTLD_LAZY);
auto function = dlsym(pluginFile, "ExpectedFunction");

This works fine if dlopen returns the right file. My problem is when dlopen doesn't find a file and returns NULL. What currently happens is that this call is made:

dlsym(0x0, "ExpectedFunction");

The problem is that this returns a random function in my project called ExpectedFunction. What I thought would happen is that dlsym would return NULL since the passed handle is NULL. I'm not able to find the expected behavior for such a use case online.

My question is, what is supposed to happen when you pass a NULL handle to dlsym? Will it simply return NULL, or will it interpret it as a handle at location 0x0? If the intended behavior is the latter, then I'll simply add a check to make sure dlopen succeeded. If not, I'd like to know why it randomly returns a function with the same name from another library if the handle is NULL.

My current use case is that I am loading 10 shared libraries that I made that all have a function ExpectedFunction(). However, if we call dlopen with a filename of a shared library that does not exist, it will return NULL. Then, dlsym will return a pointer to ExpectedFunction() of the last library that was loaded.


Solution

  • My question is, what is supposed to happen when you pass a NULL handle to dlsym?

    The specification says:

    If handle does not refer to a valid object opened by dlopen() ... dlsym() shall return NULL.

    However, there are some reserved handle values that have special behaviour. If you pass such a reserved handle, then the behaviour is different. The exact values are unspecified by POSIX, but for example in glibc:

    # define RTLD_NEXT        ((void *) -1l)
    # define RTLD_DEFAULT        ((void *) 0)
    

    (void *) 0 is a null pointer, so on glibc you accidentally passed RTLD_DEFAULT into dlsym. Of this handle, the spec says:

    RTLD_DEFAULT

    The symbol lookup happens in the normal global scope; that is, a search for a symbol using this handle would find the same definition as a direct use of this symbol in the program code.

    So, in conclusion, what is supposed to happen depends on whether NULL is a reserved value or not. It happens to be reserved in glibc, but is not necessarily so in other implementations.

    You should check that dlopen does not return null (or check that dlerror does return null) before passing the handle to dlsym.
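
A minimal sketch of that check, assuming a glibc-style `<dlfcn.h>`. The names `nullHandleIsDefault` and `loadPluginSymbol` are hypothetical helpers made up for illustration, not part of any library. The first reports whether a null handle is the same value as RTLD_DEFAULT on the current platform; the second refuses to call dlsym unless dlopen succeeded.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // glibc only exposes RTLD_DEFAULT when this is defined
#endif
#include <dlfcn.h>

#include <cassert>
#include <string>

// True when a null handle is indistinguishable from RTLD_DEFAULT on this
// platform (the case on glibc, per the macros quoted above).
bool nullHandleIsDefault() {
    return RTLD_DEFAULT == nullptr;
}

// Hypothetical helper: opens fileName and looks up symbolName.
// Returns nullptr and fills `error` on any failure, so a failed dlopen
// never reaches dlsym as a null handle.
void* loadPluginSymbol(const char* fileName, const char* symbolName,
                       std::string& error) {
    // dlopen(nullptr, ...) returns a handle for the main program, so reject it.
    assert(fileName != nullptr && symbolName != nullptr);

    void* handle = dlopen(fileName, RTLD_LAZY);
    if (handle == nullptr) {
        const char* msg = dlerror();
        error = msg ? msg : "dlopen failed";
        return nullptr;
    }

    dlerror();  // clear any stale error before the lookup
    void* symbol = dlsym(handle, symbolName);
    if (const char* msg = dlerror()) {
        error = msg;
        dlclose(handle);
        return nullptr;
    }
    // The handle is deliberately left open: closing it could unload the
    // library and invalidate `symbol`.
    return symbol;
}
```

With this in place, a call such as `loadPluginSymbol(fileName, "ExpectedFunction", error)` for a missing library yields nullptr plus the dlerror message, instead of silently resolving ExpectedFunction from a library that was loaded earlier. Depending on the glibc version, linking may require `-ldl`.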