clang not recognizing uninitialized pointer found in static library


I've found a curiosity when compiling with clang (on a MacBook, if it helps). Suppose I have two files:

blah.c

int *p;

main.c

#include <stdio.h>

extern int *p;

int main() {
    printf("%p\n", p);
    return 0;
}

If I compile with

clang blah.c main.c

everything works out fine. However, if I do

clang -c blah.c
ar rcs libblah.a blah.o
clang main.c libblah.a

I get a linker error:

Undefined symbols for architecture x86_64:
  "_p", referenced from:
      _main in test-4bf0d6.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Interestingly, if I initialize the variable in blah.c,

#include <stddef.h>

int *p = NULL;

the error goes away.

Also, compiling with gcc doesn't produce this behavior. What exactly is going on with clang here?

Here's the output from clang --version:

Apple clang version 13.0.0 (clang-1300.0.29.30)
Target: x86_64-apple-darwin21.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Solution

  • What exactly is going on with clang here?

    TL;DR: Your Clang has a bug. You can probably work around it without modifying your code by adding -fno-common to your compile options.


    More detail

    Both variations of your code are correct, and as far as the C language specification is concerned, they have the same meaning. On my Linux machine, GCC 8.5 and Clang 12 both accept both variations and successfully build working executables, whether blah.o is linked directly or from a library.

    But if you use nm to examine the library built with and without the initializer for p, you will likely get a hint about what is happening. Without an initializer, I see (with either compiler) that p has type 'C' (common). With an initializer (to null), I see that it has type 'B' (BSS).

    That reflects a traditional behavior of Unix C implementations: merging multiple definitions of the same symbol, as long as no more than one of them has an explicit initializer. This is an extension to standard C, which requires exactly one definition of each external symbol that a program references. Among other things, the extension covers the common error of omitting extern from variable declarations in headers, provided that the header does not specify an initializer.

    To implement that, the toolchain needs to distinguish between symbols defined with an explicit initializer and those defined without, and that's where (for C) symbol type "common" comes in: it marks a symbol that is defined, but without an explicit initializer. Typical linker behavior is to treat all such symbols as undefined if one of the objects being linked defines that symbol with a different type, or otherwise to treat all but one of them as undefined and the remaining one as having type B (implying default initialization).

    But the macOS development toolchain seems to have hatched a bug. In your example, it fails to recognize the type C symbol as a viable definition when that symbol appears in a library. The issue might be in the Clang front end, in the system linker, or in a combination of both. Perhaps this arrived together with Apple's recent tightening (and subsequent re-loosening) of the compiler's default conformance settings.

    You can probably work around this issue by adding -fno-common to your C compiler flags. GCC and Clang both accept that flag for disabling the symbol merging described above, and, at least on my machine, they both implement it by emitting the symbol as type B when it is defined without an explicit initializer, just as if it had been explicitly initialized to a null pointer. Note well, however, that this will break any code that presently relies on that merging behavior.