Tags: c, linker, linker-errors, esp32, esp-idf

ESP-IDF linker not reporting duplicate functions so generating unsafe binary


I wrote a .c file (C, not C++) that contains a function called close(). It was public, i.e. not static. I realise this was a foolish choice of function name, but I assumed the linker would tell me if there was any clash. And indeed, in an unrelated part of the code I was using sockets (#include "lwip/sockets.h") and calling the close() function that was defined there, which had completely different arguments.

I built the system with idf.py build and there was no error or warning.

But at runtime it crashed when my sockets code called close() and it was evident from the stack trace that it had called my close() function not the intended one from lwip/sockets.h.

Obviously I can easily fix this now I have seen what it is, but I have lost trust in the linker. Surely it should have given some duplicate symbol warning?! Given that idf.py build seems to compile over 1000 files there is a lot of stuff in the namespace, so an automated check seems essential.

My real concern is that there could be a clash that might not cause a runtime exception, but instead undefined and unwanted behaviour that would be very hard to debug.

On other C platforms (Microchip compilers and several others) I have not encountered such an issue. I think it is because these have all required me to list all the .obj and library files that are to be linked, so the linker can see duplicates. But the ESP-IDF platform only needs me to list my objects and it somehow finds the rest.


Solution

  • It helps to understand a bit better what a linker does. Absolutely nothing is wrong with yours; it does exactly what it should:

    1. Collect all symbols from all object files (and the C startup).
    2. Resolve the symbols between the object files.
    3. Try to resolve any symbols that are still unresolved at this point from the given libraries.

    In other words, the linker only turns to the libraries at the end of the object link phase, and then only for symbols that are not yet resolved. In your case, the reference to close() was already satisfied by an object file, and object files take precedence, so the linker never searched the libraries for close(). Because it never looked at the library's definition, it never saw a duplicate to report. In short: the linker only takes what it needs from libraries. This ensures that only the absolute minimum of the object files contained in libraries ends up in the resulting binary.

    The order in which free-standing objects and objects in libraries are pulled in typically depends on the order in which they appear on the command line. By changing the order of objects and libraries, you may be able to make the linker warn you about such clashes. Keep in mind, though, that you should only override system functions if you know what you're doing.