I'm currently trying to compile some code to work on AVR (ATMEGA2560) and it looks like I'm running out of RAM.
I looked at the listing (generated with avr-objdump -x -S project.elf) and I'm finding that .data is way too big to go into the 8 KB RAM (it's around 12 KB) - in fact the 'real' RAM contents start at 0x802FEE, which is in the address space for 'external RAM'.
I see:
Sections:
Idx Name      Size      VMA       LMA       File off  Algn
  0 .data     00003088  00800200  0002297a  00022a0e  2**0
              CONTENTS, ALLOC, LOAD, DATA
  1 .text     0002297a  00000000  00000000  00000094  2**1
              CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .bss      00000755  00803288  00803288  00025a98  2**2
              ALLOC
  3 .stab     00001a7c  00000000  00000000  00025a98  2**2
              CONTENTS, READONLY, DEBUGGING
  4 .stabstr  00000d4d  00000000  00000000  00027514  2**0
              CONTENTS, READONLY, DEBUGGING
I then grepped for .data symbols and sorted based on address:
grep "\.data" project.lst | sort
00800200 g .data 00000000 __data_start
00800200 l d .data 00000000 .data
00802fee g O .data 00000008 __thenan_sf
00802ff6 g O .data 00000100 __clz_tab
008030f6 l O .data 00000004 next
... lots of stuff in here ....
00803267 l O .data 00000010 CSWTCH.18
00803277 w O .data 00000010 _ZTV14HardwareSerial
00803288 g .data 00000000 __data_end
00803288 g .data 00000000 _edata
So .data is supposed to start at 0x800200, but for some reason the first symbol is at 0x802fee - which is well out of the address range.
I tried -Wl,--section-start,.data=0x800000,--defsym=__heap_end=0x8021FF but this only moves things back by 0x200, as expected - there's still something at the start of .data that is pushing everything out.
Does anyone know what this is, or why it's happening? It's annoying because everything should fit in if it weren't for that.
Well, OP already found out (with the help of Olaf's comments concerning where to start looking) what was actually taking up the space, but indeed, there should be an answer, so trying to summarize here:
The space in the data segment is also occupied by anonymous data (literals in code), especially string literals.
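To put a number on that, here is a small sketch that only redoes the arithmetic from the listing in the question. The addresses are copied from the question; the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Addresses copied from the objdump listing in the question. */
#define DATA_START   0x800200UL  /* __data_start */
#define FIRST_SYMBOL 0x802feeUL  /* __thenan_sf, first named object in the listing */
#define DATA_END     0x803288UL  /* __data_end */

/* Total size of .data in bytes. */
uint32_t data_total(void)
{
    return (uint32_t)(DATA_END - DATA_START);
}

/* Bytes in .data that sit in front of the first named object. */
uint32_t data_unnamed(void)
{
    return (uint32_t)(FIRST_SYMBOL - DATA_START);
}
```

That gives 12424 bytes for .data in total, of which 11758 bytes lie before the first named symbol, so the named objects account for only a few hundred bytes.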
The tricky thing here is: AVR is a Harvard architecture, which means there are separate, independent address spaces for code and data, contrary to the widespread von Neumann architecture with only one unified address space. C was designed for the latter, so the language itself only knows one address space. On AVR chips, the data address space is backed by RAM, the code address space by flash memory.
Now, if you are running low on RAM, it would be sensible to place read-only data in flash, too, but there's a catch: given a function taking a char *, the compiler will translate this function assuming the pointer points into the data address space and emit assembly fetching from there. Therefore, a second, similar function is needed that looks in the code address space instead.
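As a rough sketch of what such a pair of functions looks like: the two versions below differ only in how they fetch each byte. pgm_read_byte() and PROGMEM are the real avr-libc names; the function names, the banner string, and the host fallback in the #else branch (which just lets the snippet compile off-target) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* real PROGMEM and pgm_read_byte() from avr-libc */
#else
/* Host fallback (assumption, for illustration only): with a single address
   space the flash accessor degrades to a plain memory read. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Hypothetical example string; on AVR it is placed in flash, not RAM. */
static const char banner[] PROGMEM = "hello from flash";

/* Length of a string that lives in the data address space (RAM on AVR). */
size_t ram_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')            /* ordinary load from data space */
        n++;
    return n;
}

/* Same algorithm, but every byte is fetched from the code address space. */
size_t flash_strlen(const char *s)
{
    size_t n = 0;
    while (pgm_read_byte(s + n) != 0)   /* explicit read from flash on AVR */
        n++;
    return n;
}
```

On the target, passing banner to ram_strlen() would read whatever happens to sit at that address in RAM, which is exactly why the second variant is needed.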
The solution in avr-gcc and avr-libc to this problem is to provide the PROGMEM qualifier, so the compiler knows that data qualified this way should live in program memory (code address space). There's also a convenience macro PSTR() to make string literals live in program memory without the need to introduce another PROGMEM-qualified identifier. To work with these, avr-libc has variants of a few standard functions postfixed with _P (e.g., puts_P()) that do exactly the same as the originals, but expect their arguments in program memory. It's a bit of a hassle to use it that way, but this can't be handled transparently as long as C doesn't know about different address spaces.
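A minimal sketch of how this looks in practice: PROGMEM, PSTR(), strlen_P(), strcpy_P() and strcmp_P() are real avr-libc names from <avr/pgmspace.h>. The greeting string, the two helper functions, and the host fallback macros (there only so the snippet also compiles off-target) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* PROGMEM, PSTR() and the *_P functions */
#else
/* Host fallback (assumption, for illustration only): map the flash-aware
   names onto their ordinary counterparts. */
#define PROGMEM
#define PSTR(s) (s)
#define strlen_P(s) strlen(s)
#define strcpy_P(dst, src) strcpy((dst), (src))
#define strcmp_P(a, b) strcmp((a), (b))
#endif

/* Named constant kept in flash on AVR, so it costs no RAM. */
static const char greeting[] PROGMEM = "ready";

/* Copy the flash string into a caller-provided RAM buffer.
   Returns the number of characters copied, or 0 if the buffer is too small. */
size_t load_greeting(char *buf, size_t bufsize)
{
    if (strlen_P(greeting) >= bufsize)
        return 0;
    strcpy_P(buf, greeting);
    return strlen(buf);
}

/* Compare a RAM string against a literal that PSTR() places in flash. */
int is_ok(const char *reply)
{
    return strcmp_P(reply, PSTR("OK")) == 0;
}
```

Printing works the same way, e.g. puts_P(PSTR("some text")) instead of puts("some text"), so the literal never takes up space in .data.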
As RAM is often quite small on AVR chips, there are a few other things you can do to conserve it: not using a heap at all (many embedded programs don't really need dynamic allocation, so think about whether yours does), having the compiler "pack" all structs (-fpack-struct option), always using the smallest possible size for enums (-fshort-enums option), always using uint8_t for small integers, using bitfields (or encoding state with bit masking/shifting) where appropriate, and so on.
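To illustrate the last two points, here is a small sketch with a made-up device state. All names are hypothetical, and the exact layout of bit-fields is implementation-defined, so the one-byte size is what I'd expect from GCC rather than something the C standard guarantees.

```c
#include <stdint.h>

/* Hypothetical device state with one byte per field: 4 bytes of RAM. */
struct state_wide {
    uint8_t enabled;
    uint8_t error;
    uint8_t mode;      /* only values 0..3 are used */
    uint8_t retries;   /* only values 0..15 are used */
};

/* The same state as bit-fields: 1 + 1 + 2 + 4 = 8 bits. */
struct state_bits {
    uint8_t enabled : 1;
    uint8_t error   : 1;
    uint8_t mode    : 2;
    uint8_t retries : 4;
};

/* Alternative: keep everything in one uint8_t and use masks and shifts. */
#define FLAG_ENABLED 0x01u
#define FLAG_ERROR   0x02u
#define MODE_SHIFT   2
#define MODE_MASK    (0x03u << MODE_SHIFT)

/* Return state with its two mode bits replaced by mode (0..3). */
static inline uint8_t set_mode(uint8_t state, uint8_t mode)
{
    return (uint8_t)((state & ~MODE_MASK) |
                     (((unsigned)mode << MODE_SHIFT) & MODE_MASK));
}

/* Extract the two mode bits from state. */
static inline uint8_t get_mode(uint8_t state)
{
    return (uint8_t)((state & MODE_MASK) >> MODE_SHIFT);
}
```

The bit-field version is more readable, while the mask version gives full control over the bit layout, which matters if the byte is also sent over a wire or stored in EEPROM.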