Heap and Stack allocation in ThreadX RTOS

Recently I started learning ThreadX RTOS, and I noticed that in the linker script and crt0.S provided for Cortex-M4 with the gcc toolchain, the .stack and .heap sections are allocated with sizes of 1024 bytes and 128 bytes respectively.

After writing a simple program that statically creates 2 threads on a byte pool, I ran objdump -t program.elf. It turned out that the memory allocated for the byte pool, the thread control blocks, and other ThreadX variables and pointers is all in the .bss section.

I was wondering what the purpose of creating the .stack and .heap sections is. Are they there just in case a dynamic memory allocation function such as malloc from the C standard library (newlib) is invoked?

Solution

The .stack memory is the stack used by the main() thread and, on Cortex-M at least, also by interrupt handlers. It is the system stack.

Once the scheduler starts, the main thread is done, but the stack still needs to be large enough to support any start-up initialisation and worst-case interrupt handler nesting.

It may be a good idea in some cases to do as little as possible in the main thread to minimise its stack requirement, and to start the scheduler with an initial "root thread" that creates the other threads, then either terminates and recovers its stack space, or performs some useful background process.

The .heap is required to support dynamic memory allocation. Newlib's printf and related functions, and others such as strtok, use dynamic memory, so you need a heap if you invoke those. Most RTOSes (I am not that familiar with ThreadX) also provide an option to dynamically allocate thread stacks.

The default heap allocator has non-deterministic timing characteristics, so it is best avoided in hard real-time threads.

Also be aware that malloc/free in Newlib are not thread safe by default. ThreadX or your toolchain may already provide the necessary locking, but you need to be sure that the lock functions declared in sys/lock.h have suitable overrides (using a mutex, for example).
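One place such an override can live is newlib's __malloc_lock/__malloc_unlock hook pair, which the allocator calls around heap operations. The following is only a sketch of the shape of that override: the demo_mutex_* names and malloc_lock_depth are hypothetical stand-ins invented for illustration, and here the "mutex" merely counts nesting so the code can be built anywhere. In a real ThreadX port the two stand-in functions would presumably wrap tx_mutex_get and tx_mutex_put on a TX_MUTEX. Newlib may call __malloc_lock recursively from the same thread, so whatever lock is used has to tolerate nested acquisition.

```c
#include <assert.h>

struct _reent;  /* opaque here; newlib defines it in <reent.h> */

/* Hypothetical stand-in for a recursive RTOS mutex. It only counts
 * nesting depth; a real port would block on the RTOS mutex instead. */
typedef struct {
    unsigned depth;
} demo_mutex_t;

static demo_mutex_t malloc_mutex;

static void demo_mutex_lock(demo_mutex_t *m)
{
    m->depth++;              /* real port: acquire the RTOS mutex */
}

static void demo_mutex_unlock(demo_mutex_t *m)
{
    assert(m->depth > 0);    /* unlock without a matching lock is a bug */
    m->depth--;              /* real port: release the RTOS mutex */
}

/* Hooks that newlib's allocator calls around heap operations. */
void __malloc_lock(struct _reent *r)
{
    (void)r;
    demo_mutex_lock(&malloc_mutex);
}

void __malloc_unlock(struct _reent *r)
{
    (void)r;
    demo_mutex_unlock(&malloc_mutex);
}

/* Test helper (hypothetical): current nesting depth of the lock. */
unsigned malloc_lock_depth(void)
{
    return malloc_mutex.depth;
}
```

Because the lock is a mutex rather than an interrupt mask, only threads that actually call into the allocator can be delayed by it.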

Caution though; I have seen some terrible implementations from vendors who should know better that implement the locks by treating malloc/free as critical sections by disabling interrupts and/or suspending the scheduler. That affects the real time performance and behaviour of every task, whether it uses malloc or not. Using a mutex limits the effect to only those tasks using dynamic allocation, which as advised should not be hard real-time sections in any case.

Often it is advisable to avoid dynamic memory allocation altogether in memory-constrained, safety-critical and real-time systems. In that case you will also have to avoid calls such as printf, sprintf et al, which are often quite stack hungry as well. There are any number of cut-down "tiny printf" implementations you might use instead. If you remove the sbrk and sbrk_r syscall implementations from Newlib, any attempt to use malloc will fail to link. It may not be that easy though if, for example, ThreadX relies on it. You may have no choice but to provide at least some heap.
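As a rough illustration of what heap-free formatting looks like, here is a minimal sketch of an unsigned-to-decimal converter that writes into a caller-supplied buffer. It is not taken from any particular tiny printf library; the name u32_to_dec and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write `value` as decimal text into `buf`, NUL-terminated.
 * Returns the number of digits written, or 0 if `buf_len` is too
 * small to hold the digits plus the terminator (buf is then untouched).
 * Uses only a 10-byte scratch array on the stack and no heap. */
size_t u32_to_dec(uint32_t value, char *buf, size_t buf_len)
{
    char tmp[10];   /* UINT32_MAX has 10 decimal digits */
    size_t n = 0;

    assert(buf != NULL);

    /* Produce digits least-significant first. */
    do {
        tmp[n++] = (char)('0' + (value % 10u));
        value /= 10u;
    } while (value != 0u);

    if (buf_len < n + 1u) {
        return 0;
    }

    /* Copy them out in reverse so the most significant digit is first. */
    for (size_t i = 0; i < n; i++) {
        buf[i] = tmp[n - 1u - i];
    }
    buf[n] = '\0';
    return n;
}
```

The resulting string can then be passed to whatever raw output routine the board provides (a UART write function, for example) without touching the heap.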

If your application requires deterministic memory management, then you should use whatever services the RTOS provides for that, such as ThreadX byte or block pools. For an RTOS without native support, implementing a block pool is trivial - you simply create an RTOS queue of pointers to memory blocks (in a static array, for example), then take from the queue to allocate and return to the queue to free.
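The queue-of-pointers idea can be sketched as follows. To keep the example buildable without any RTOS, a small ring buffer stands in for the RTOS queue; on a real system free_queue_put and free_queue_get would be replaced by the RTOS queue send and receive calls, which also take care of locking between threads. All names and sizes here (BLOCK_SIZE, BLOCK_COUNT, pool_init, pool_alloc, pool_free) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE   32u   /* bytes per block (illustrative) */
#define BLOCK_COUNT  4u    /* number of blocks (illustrative) */

/* Backing storage: a static array, so nothing comes from the heap.
 * The union forces each block to be suitably aligned for any type. */
typedef union {
    max_align_t align;
    uint8_t     bytes[BLOCK_SIZE];
} block_t;

static block_t pool_storage[BLOCK_COUNT];

/* Stand-in for an RTOS queue of pointers (not thread safe as written). */
static void  *free_queue[BLOCK_COUNT];
static size_t q_head;
static size_t q_count;

static int free_queue_put(void *p)
{
    if (q_count == BLOCK_COUNT) {
        return -1;                       /* queue full */
    }
    free_queue[(q_head + q_count) % BLOCK_COUNT] = p;
    q_count++;
    return 0;
}

static void *free_queue_get(void)
{
    if (q_count == 0) {
        return NULL;                     /* queue empty: pool exhausted */
    }
    void *p = free_queue[q_head];
    q_head = (q_head + 1u) % BLOCK_COUNT;
    q_count--;
    return p;
}

/* Fill the queue with a pointer to every block. */
void pool_init(void)
{
    q_head = 0;
    q_count = 0;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        (void)free_queue_put(&pool_storage[i]);
    }
}

/* Allocate: take a block pointer from the queue (NULL if none left). */
void *pool_alloc(void)
{
    return free_queue_get();
}

/* Free: return the block pointer to the queue. */
void pool_free(void *p)
{
    assert(p != NULL);
    int rc = free_queue_put(p);
    assert(rc == 0);                     /* more frees than blocks is a bug */
    (void)rc;
}
```

Both allocation and release are a single queue operation on fixed-size blocks, so their cost does not depend on allocation history and fragmentation cannot occur. With a real RTOS queue, pool_alloc could also block with a timeout until a block is returned.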