c, gcc, ld

Why doesn't a linked binary file's _size symbol work correctly?


I use 'ld -r -b binary -o binary.o foo.jpeg' to embed resources in my program, and it works great. What I don't understand is why the int _binary_size symbol never reads correctly: the value is negative or far too large, yet it stays the same between program runs. I always have to compute _binary_end - _binary_start instead, which works flawlessly. It seems the size symbol works for no one, like here .... Why is that?

There is no reason not to use end - start, since it replaces the size symbol, but I am still curious.

edit: code example.

extern const unsigned char _binary_scna4_jpg_start;
extern const unsigned char _binary_scna4_jpg_end;
extern const int _binary_scna4_jpg_size;

int size = &_binary_scna4_jpg_end - &_binary_scna4_jpg_start;
printf("Size is %d vs %d \n", size, _binary_scna4_jpg_size);

this prints:

Size is 1192071 vs -385906356 

The first number is the correct size of the binary, and all my images read flawlessly.

Output of nm for good measure:

0000000000123087 D _binary_scna4_jpg_end
0000000000123087 A _binary_scna4_jpg_size
0000000000000000 D _binary_scna4_jpg_start

Solution

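  • The problem comes from position-independent executables (PIE). Executables used to be loaded at fixed memory addresses chosen at link time, which made attacks easier because an attacker knew exactly where each part of the program lived. Address space layout randomization (ASLR) was introduced to prevent that, and PIE is what lets the main executable itself be loaded at a randomized base address. The size symbol is caught by this as well: _binary_scna4_jpg_size is not an integer variable. It is a symbol whose address encodes the size (nm marks it A, absolute, with value 0x123087, which is 1192071 in decimal), just as _start and _end are symbols whose addresses mark the data. In a PIE, the address your code obtains for it is shifted by the load base along with everything else.

    If you build with the GCC option -no-pie, position independence is disabled and the address of the symbol, read as (size_t)&_binary_scna4_jpg_size, is the correct size because nothing is relocated. Reading the symbol as an int variable, as in the question, instead loads whatever bytes happen to be stored at that address, which would explain a value that is wrong yet identical on every run. Since PIE is on by default in most toolchains these days, the address the program sees is the size plus the load base, so it is basically garbage. You could still recover the size if you knew where the relocated image begins, but since you already have _binary_scna4_jpg_start and _binary_scna4_jpg_end, subtracting them amounts to the same thing.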