c gcc arm gnu-arm

Different Static Global Variables Share the Same Memory Address


Summary

I have several C source files that each declare an identically named static global variable. My understanding is that a static global variable has internal linkage, so it should be visible only within its own file, yet when debugging I can see that the identically named variables share the same memory address.

It is as if the static keyword is being ignored and the global variables are being treated as extern instead. Why is this?

Example Code

foo.c:

/* Private variables -----------------------------------*/
static myEnumType myVar = VALUE_A;

/* Exported functions ----------------------------------*/
void someFooFunc(void) {
    myVar = VALUE_B;
}

bar.c:

/* Private variables -----------------------------------*/
static myEnumType myVar = VALUE_A;

/* Exported functions ----------------------------------*/
void someBarFunc(void) {
    myVar = VALUE_C;
}

baz.c:

/* Private variables -----------------------------------*/
static myEnumType myVar = VALUE_A;

/* Exported functions ----------------------------------*/
void someBazFunc(void) {
    myVar = VALUE_D;
}

Debugging Observations

  1. Set breakpoints on the myVar = ... line inside each function.
  2. Call someFooFunc, someBarFunc, and someBazFunc in that order from main.
  3. Inside someFooFunc, myVar is initially VALUE_A; after stepping over the line it is VALUE_B.
  4. Inside someBarFunc, myVar is for some reason initially VALUE_B before stepping over the line, not VALUE_A as I'd expect. This suggests the linker may have merged the separate global variables because they have an identical name.
  5. The same goes for someBazFunc when it is called.
  6. If I use the debugger to evaluate &myVar at each breakpoint, the same address is given every time.

Tools & Flags

Toolchain: GNU ARM GCC (6.2 2016q4)

Compiler options:

arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mlong-calls -O1 -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -ffreestanding -fno-move-loop-invariants -Wall -Wextra  -g3 -DDEBUG -DTRACE -DOS_USE_TRACE_ITM -DSTM32L476xx -I"../include" -I"../system/include" -I"../system/include/cmsis" -I"../system/include/stm32l4xx" -I"../system/include/cmsis/device" -I"../foo/inc" -std=gnu11 -MMD -MP -MF"foo/src/foo.d" -MT"foo/src/foo.o" -c -o "foo/src/foo.o" "../foo/src/foo.c"

Linker options:

arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mlong-calls -O1 -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -ffreestanding -fno-move-loop-invariants -Wall -Wextra  -g3 -T mem.ld -T libs.ld -T sections.ld -nostartfiles -Xlinker --gc-sections -L"../ldscripts" -Wl,-Map,"myProj.map" --specs=nano.specs -o ...

Solution

  • NOTE: I understand that the OP's target platform is ARM, but I'm posting an answer in terms of x86 anyway. I have no ARM toolchain at hand, and the question is not limited to a particular architecture.

    Here's a simple test setup. Note that I'm using int instead of a custom enum typedef, since it should not matter here.

    foo.c

    static int myVar = 1;
    
    int someFooFunc(void)
    {
            myVar += 2;
            return myVar;
    }
    

    bar.c

    static int myVar = 1;
    
    int someBarFunc(void)
    {
            myVar += 3;
            return myVar;
    }
    

    main.c

    #include <stdio.h>
    
    int someFooFunc(void);
    int someBarFunc(void);
    
    int main(int argc, char* argv[])
    {
            printf("%d\n", someFooFunc());
            printf("%d\n", someBarFunc());
            return 0;
    }
    

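    I'm compiling it on x86_64 Ubuntu 14.04 with GCC 4.8.4. Note that g++ treats the .c files as C++, which is why the symbol names in the disassembly further down are mangled: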

    $ g++ main.c foo.c bar.c
    $ ./a.out
    3
    4
    

    These results show that the myVar variables in foo.c and bar.c are different objects: had they shared storage, the second call would have printed 6 (1 + 2 + 3) rather than 4. If you look at the disassembly (objdump -D ./a.out):

    000000000040052d <_Z11someFooFuncv>:
      40052d:       55                      push   %rbp
      40052e:       48 89 e5                mov    %rsp,%rbp
      400531:       8b 05 09 0b 20 00       mov    0x200b09(%rip),%eax        # 601040 <_ZL5myVar>
      400537:       83 c0 02                add    $0x2,%eax
      40053a:       89 05 00 0b 20 00       mov    %eax,0x200b00(%rip)        # 601040 <_ZL5myVar>
      400540:       8b 05 fa 0a 20 00       mov    0x200afa(%rip),%eax        # 601040 <_ZL5myVar>
      400546:       5d                      pop    %rbp
      400547:       c3                      retq
    
    0000000000400548 <_Z11someBarFuncv>:
      400548:       55                      push   %rbp
      400549:       48 89 e5                mov    %rsp,%rbp
      40054c:       8b 05 f2 0a 20 00       mov    0x200af2(%rip),%eax        # 601044 <_ZL5myVar>
      400552:       83 c0 03                add    $0x3,%eax
      400555:       89 05 e9 0a 20 00       mov    %eax,0x200ae9(%rip)        # 601044 <_ZL5myVar>
      40055b:       8b 05 e3 0a 20 00       mov    0x200ae3(%rip),%eax        # 601044 <_ZL5myVar>
      400561:       5d                      pop    %rbp
      400562:       c3                      retq   
    

    You can see that the static variables in the two modules really do live at different addresses: 0x601040 for foo.c and 0x601044 for bar.c. However, both are recorded under the same symbol name, _ZL5myVar, which is what confuses GDB.

    You can double-check that by means of objdump -t ./a.out:

    0000000000601040 l     O .data  0000000000000004              _ZL5myVar
    0000000000601044 l     O .data  0000000000000004              _ZL5myVar
    

    Once again: different addresses, same symbol name. How GDB resolves this conflict is purely implementation-dependent.

    I strongly believe this is what is happening in your case as well. To be sure, though, you might want to repeat these steps in your environment.
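One way to take the debugger's symbol lookup out of the picture is to have the program report the addresses itself. The following is only a minimal single-file sketch: the accessor names (fooMyVarAddr, barMyVarAddr, myVarsAreDistinct, printMyVarAddrs) are hypothetical, and the statics are declared at block scope purely so that two identically named variables fit into one file. In the real project each accessor would sit in its own .c file next to the file-scope static it reports on. It also assumes printf output is available, which may not hold on a bare-metal target.

```c
#include <stdio.h>

/* Stand-in for foo.c. In the real project myVar would be the
 * file-scope static and this accessor would live in foo.c. */
void *fooMyVarAddr(void)
{
    static int myVar = 1; /* block scope only so both fit in one file */
    return &myVar;
}

/* Stand-in for bar.c, with an identically named static. */
void *barMyVarAddr(void)
{
    static int myVar = 1;
    return &myVar;
}

/* Returns 1 when the two identically named statics occupy
 * distinct storage, 0 otherwise. */
int myVarsAreDistinct(void)
{
    return fooMyVarAddr() != barMyVarAddr();
}

/* Prints both addresses so they can be compared with the values
 * the debugger shows for &myVar at each breakpoint. */
void printMyVarAddrs(void)
{
    printf("foo myVar: %p\n", fooMyVarAddr());
    printf("bar myVar: %p\n", barMyVarAddr());
}
```

Calling printMyVarAddrs() from main and comparing its output with what the debugger reports for &myVar should show whether the variables were really merged or only the symbol lookup is ambiguous. Within GDB itself, the expression form 'foo.c'::myVar (file name in single quotes) asks for the variable as seen from a particular source file, which may sidestep the ambiguity; whether it does so with this particular toolchain is untested here.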