linux, dlopen, sigfpe

On Linux, what can cause dlopen to emit SIGFPE?


I have a library of dubious origins which the file utility identifies as a 32-bit executable. However, when I try to dlopen it on a 32-bit CentOS 4.4 machine, dlopen terminates with SIGFPE. Surely if there were something wrong with the format of the binary, dlopen should return an error instead of crashing?

So the question is: What kinds of problems can cause dlopen to emit SIGFPE?


Solution

  • Some possible reasons are:

    1. A genuine division by zero, for example in initialisation code that runs while the library is being loaded (rule this out with gdb).
    2. Architecture mismatch (did you compile the DSO yourself on the same architecture, or is it prebuilt?).
    3. ABI compatibility problems (loading a DSO built for one Linux distro on a different one).
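
    To check reason 2 without relying only on the one-line summary from file, you can decode the ELF header yourself. The following is a minimal Python sketch, not a complete ELF parser: describe_elf and describe_elf_file are names made up for this example, and the machine table covers only a few common values.

```python
import struct

# A small subset of e_machine values from the ELF specification.
EM_NAMES = {3: "x86 (i386)", 40: "ARM", 62: "x86-64", 183: "AArch64"}
# e_type values: what kind of ELF object this is.
ET_NAMES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}


def describe_elf(header: bytes) -> dict:
    """Decode word size, byte order, type and machine from the first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])        # EI_CLASS
    endian = {1: "<", 2: ">"}.get(header[5])    # EI_DATA
    if bits is None or endian is None:
        raise ValueError("unrecognised EI_CLASS or EI_DATA")
    # e_type and e_machine are 16-bit fields at offsets 16 and 18
    # in both the 32-bit and the 64-bit header layout.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": bits,
        "byte_order": "little" if endian == "<" else "big",
        "type": ET_NAMES.get(e_type, hex(e_type)),
        "machine": EM_NAMES.get(e_machine, hex(e_machine)),
    }


def describe_elf_file(path: str) -> dict:
    """Read just enough of the file at `path` to describe its ELF header."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))
```

    Calling describe_elf_file("./libdubious.so") (a hypothetical path) and comparing the result with the same call on a library that is known to load on the target machine shows whether the word size, machine, or object type differ.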

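    There is an interesting discussion of symbol hash tables in the ELF format on GNU systems, which describes how an ABI mismatch can cause SIGFPE when you mix and match DSOs that were not built for that distro/system. As I understand it, a DSO linked with only the newer GNU-style hash table (DT_GNU_HASH) has no classic DT_HASH table, so an older dynamic loader that knows only DT_HASH ends up with a bucket count of zero and divides by it during symbol lookup.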

    Run GDB against your executable with:

    $ gdb ./my_executable
    (gdb) run
    

    When the program crashes, get a backtrace with:

    (gdb) bt
    

    If the stack ends in do_lookup_x (), you likely have the same problem and should make sure your DSO was built for the system you are trying to load it on. Since you say the library has dubious origins, the cause is probably an ABI problem similar to the one described.
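
    One way to test the hash-table theory is to look at which hash tables the DSO advertises in its dynamic section. This is a rough Python sketch under the assumption that the file is a well-formed ELF object with a PT_DYNAMIC segment. hash_styles is a name invented for this example. The tag values DT_HASH (4) and DT_GNU_HASH (0x6ffffef5) are the ones defined in elf.h.

```python
import struct

PT_DYNAMIC = 2
DT_NULL, DT_HASH, DT_GNU_HASH = 0, 4, 0x6FFFFEF5


def hash_styles(data: bytes) -> set:
    """Return the subset of {"sysv", "gnu"} listed in the ELF dynamic section."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"       # EI_DATA: 1 = little-endian
    # Locate the program header table (field offsets differ by class).
    if is64:
        e_phoff, = struct.unpack_from(end + "Q", data, 32)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 54)
    else:
        e_phoff, = struct.unpack_from(end + "I", data, 28)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 42)
    found = set()
    for i in range(e_phnum):
        ph = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from(end + "I", data, ph)
        if p_type != PT_DYNAMIC:
            continue
        if is64:
            p_offset, = struct.unpack_from(end + "Q", data, ph + 8)
            p_filesz, = struct.unpack_from(end + "Q", data, ph + 32)
            fmt, step = end + "qQ", 16       # Elf64_Dyn: d_tag, d_val
        else:
            p_offset, = struct.unpack_from(end + "I", data, ph + 4)
            p_filesz, = struct.unpack_from(end + "I", data, ph + 16)
            fmt, step = end + "iI", 8        # Elf32_Dyn: d_tag, d_val
        # Walk the dynamic entries until the DT_NULL terminator.
        for off in range(p_offset, p_offset + p_filesz - step + 1, step):
            d_tag, _ = struct.unpack_from(fmt, data, off)
            if d_tag == DT_NULL:
                break
            if d_tag == DT_HASH:
                found.add("sysv")
            elif d_tag == DT_GNU_HASH:
                found.add("gnu")
    return found
```

    For example, hash_styles(open("./libdubious.so", "rb").read()) on a hypothetical file name. A result containing only "gnu" would be consistent with the mismatch described above when the target system's loader is old. If you can rebuild the library, linking with -Wl,--hash-style=both makes GNU ld emit both tables.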

    Get a non-dubious library / executable! ;)

    Good luck!