I've been running code on an ARM Cortex-M0+ core and I see that the vast majority of my time is spent in floating-point calculations, so I am experimenting with a custom floating-point routine for use in very low power applications.
I've been using ARM GCC for a bare-metal build on an M0+ (which has no hardware FPU). I see that floating-point multiplication is compiled into a call to __aeabi_fmul, which is then resolved at link time to produce the final ELF file.
My questions are as follows:
I understand that the second part requires me to modify the compiler. I've been looking into Clang/LLVM for this, since the general consensus seems to be that it's easier to modify than GCC. I'm just trying to find out whether this is even possible or whether I'm barking up entirely the wrong tree.
Thank you.
These functions are part of gcc, in the gcc library (libgcc). Download the gcc sources and search for those function names and you will find them. They are soft-float routines and are hand-tuned, so you are unlikely to do a significantly better job, but knock yourself out. I'm not sure why you would do any floating point on an MCU like that, but the language and the tools allow it, although it can consume a lot of flash and execution time. A possible compromise is to declare no float variables and do the math yourself on integers, or just use fixed point throughout.
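To make the fixed-point suggestion concrete, here is a minimal sketch in C. The Q16.16 format and the names `q16_16`, `q_from_int`, `q_add`, and `q_mul` are my own choices for illustration, not anything from gcc or the question. It ignores overflow and truncates the product toward zero.

```c
#include <stdint.h>

/* Hypothetical Q16.16 type: 16 integer bits, 16 fractional bits. */
typedef int32_t q16_16;

#define Q_ONE 65536 /* the value 1.0 in Q16.16 */

/* Convert a small integer to Q16.16 (no overflow checking). */
static inline q16_16 q_from_int(int32_t x)
{
    return x * Q_ONE;
}

/* Addition is plain integer addition (no saturation). */
static inline q16_16 q_add(q16_16 a, q16_16 b)
{
    return a + b;
}

/* Multiply in 64 bits, then scale back down by 2^16.
   Division is used instead of >> so negative values behave
   the same on every compiler (truncation toward zero). */
static inline q16_16 q_mul(q16_16 a, q16_16 b)
{
    return (q16_16)(((int64_t)a * (int64_t)b) / Q_ONE);
}
```

For example, `q_mul(q_from_int(3), Q_ONE / 2)` gives `3 * Q_ONE / 2`, that is 1.5 in this format. Note that the 64-bit intermediate is not free on an M0+ either, so a narrower format may be worth considering if the value range allows it.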
If you use gcc to link, gcc knows where the libraries are and will pull them in automatically. If you use ld to link (using gcc only as a compiler, not as the driver for the whole toolchain), ld does not know where to find the libraries, and you can simply add your own object on the command line. This is the simplest way.
You can take the GNU source for a particular function as-is, add it to your project, and then modify it, or just replace it completely with your own function.
Naturally you can go into the compiler sources, rename things, and rebuild the compiler; I'm not sure how much work you want to do here. Replacing the floating-point routines without mistakes is already a large task. As mentioned in the comments, I would leave the compiler alone and just work with it: leave the names the same and link with ld.
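As a rough idea of what a replacement written in C could look like, here is a sketch of a deliberately simplified single-precision multiply. The name `my_fmul` and the helpers `bits_of` and `float_of` are hypothetical; on the target you would give the function the name the compiler calls (`__aeabi_fmul`). This is an assumption-laden sketch, not the libgcc algorithm: it truncates instead of rounding, flushes subnormals to zero, and does not handle NaN or infinity inputs. The body must use only integer operations, otherwise the compiler would emit a call back into the very routine being replaced.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back, without aliasing issues. */
static uint32_t bits_of(float f)    { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float    float_of(uint32_t u) { float f;    memcpy(&f, &u, sizeof f); return f; }

/* Hypothetical simplified multiply: normal inputs only, truncating,
   underflow flushed to zero, overflow returned as infinity. */
float my_fmul(float a, float b)
{
    uint32_t ua = bits_of(a), ub = bits_of(b);
    uint32_t sign = (ua ^ ub) & 0x80000000u;      /* sign of the product */
    int32_t ea = (int32_t)((ua >> 23) & 0xFF);    /* biased exponents */
    int32_t eb = (int32_t)((ub >> 23) & 0xFF);

    if (ea == 0 || eb == 0)                       /* zero or subnormal input */
        return float_of(sign);

    uint32_t ma = (ua & 0x7FFFFFu) | 0x800000u;   /* restore the implicit 1 */
    uint32_t mb = (ub & 0x7FFFFFu) | 0x800000u;

    uint64_t p = (uint64_t)ma * mb;               /* 48-bit product */
    int32_t e = ea + eb - 127;                    /* remove one bias */

    if (p & (1ull << 47)) {                       /* product in [2, 4): renormalize */
        p >>= 24;
        e += 1;
    } else {                                      /* product in [1, 2) */
        p >>= 23;
    }

    if (e <= 0)
        return float_of(sign);                    /* underflow: flush to zero */
    if (e >= 255)
        return float_of(sign | 0x7F800000u);      /* overflow: infinity */

    return float_of(sign | ((uint32_t)e << 23) | ((uint32_t)p & 0x7FFFFFu));
}
```

Whether something like this actually beats the hand-tuned library routine is a separate question: on an M0+ the 32x32 to 64-bit product is itself not a single instruction, so measure before committing to it.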
start.s
.thumb
.thumb_func
.global _start
_start:
stacktop: .word 0x20001000
.word reset
.word hang
.word hang
.word hang
.thumb_func
reset:
bl notmain
.thumb_func
hang: b .
so.c
float notmain ( float a, float b )
{
    return(a+b);
}
memmap
MEMORY
{
    rom : ORIGIN = 0x00000000, LENGTH = 0x1000
    ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
    .text : { *(.text*) } > ram
    .bss : { *(.bss*) } > rom
}
build
arm-none-eabi-as start.s -o start.o
arm-none-eabi-gcc -Xlinker -T -Xlinker memmap -nostdlib -nostartfiles -ffreestanding -mthumb start.o so.c -o so.elf -lgcc
arm-none-eabi-objdump -D so.elf
It doesn't complain, but it makes a perfectly broken binary:
20000048 <__addsf3>:
20000048: e1b02080 lsls r2, r0, #1
2000004c: 11b03081 lslsne r3, r1, #1
20000050: 11320003 teqne r2, r3
20000054: 11f0cc42 mvnsne r12, r2, asr #24
20000058: 11f0cc43 mvnsne r12, r3, asr #24
2000005c: 0a000047 beq 20000180 <__addsf3+0x138>
20000060: e1a02c22 lsr r2, r2, #24
20000064: e0723c23 rsbs r3, r2, r3, lsr #24
20000068: c0822003 addgt r2, r2, r3
2000006c: c0201001 eorgt r1, r0, r1
20000070: c0210000 eorgt r0, r1, r0
Those are ARM instructions, not Thumb. Examining what the linker was passed:
0:[/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/../../../../arm-none-eabi/bin/ld]
1:[-plugin]
2:[/opt/gnuarm/libexec/gcc/arm-none-eabi/7.1.0/liblto_plugin.so]
3:[-plugin-opt=/opt/gnuarm/libexec/gcc/arm-none-eabi/7.1.0/lto-wrapper]
4:[-plugin-opt=-fresolution=/tmp/ccSyISCJ.res]
5:[-X]
6:[-o]
7:[so.elf]
8:[-L/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/thumb]
9:[-L/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0]
10:[-L/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/../../../../arm-none-eabi/lib]
11:[-T]
12:[memmap]
13:[start.o]
14:[/tmp/ccrdRU2s.o]
15:[-lgcc]
The other approach:
arm-none-eabi-gcc -O2 -c -mthumb so.c -o so.o
arm-none-eabi-ld -T memmap start.o so.o /opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/thumb/libgcc.a -o so.elf
But this is still broken:
20000038 <__addsf3>:
20000038: e1b02080 lsls r2, r0, #1
2000003c: 11b03081 lslsne r3, r1, #1
20000040: 11320003 teqne r2, r3
20000044: 11f0cc42 mvnsne r12, r2, asr #24
20000048: 11f0cc43 mvnsne r12, r3, asr #24
2000004c: 0a000047 beq 20000170 <__addsf3+0x138>
20000050: e1a02c22 lsr r2, r2, #24
20000054: e0723c23 rsbs r3, r2, r3, lsr #24
I have not yet done what is needed to get the right library; I have to run and will re-edit this later.
But my proposed solution is:
.thumb_func
.globl __aeabi_fadd
__aeabi_fadd:
bx lr
I added that to start.s for demonstration purposes:
arm-none-eabi-as start.s -o start.o
arm-none-eabi-ld -T memmap start.o so.o -o so.elf
arm-none-eabi-objdump -D so.elf
Disassembly of section .text:
20000000 <_start>:
20000000: 20001000 andcs r1, r0, r0
20000004: 20000015 andcs r0, r0, r5, lsl r0
20000008: 20000019 andcs r0, r0, r9, lsl r0
2000000c: 20000019 andcs r0, r0, r9, lsl r0
20000010: 20000019 andcs r0, r0, r9, lsl r0
20000014 <reset>:
20000014: f000 f802 bl 2000001c <notmain>
20000018 <hang>:
20000018: e7fe b.n 20000018 <hang>
2000001a <__aeabi_fadd>:
2000001a: 4770 bx lr
2000001c <notmain>:
2000001c: b510 push {r4, lr}
2000001e: f7ff fffc bl 2000001a <__aeabi_fadd>
20000022: bc10 pop {r4}
20000024: bc02 pop {r1}
20000026: 4708 bx r1
Then fill in whatever you want. Clearly this is not a real program: it breaks many rules, no numbers are actually being passed in, and so on.
But the compiler generated a call to __aeabi_fadd, I supplied an __aeabi_fadd, and the linker was happy.
What I have done in the past, since I build my own GNU toolchain anyway, is go in and put a syntax error in the file of interest and run the build. When it fails, the long command line used to build that item is on the screen. Isolate the function of interest, use that long gcc command line as a guide, then tweak and tune as desired. You get there faster than by trying to figure out all the defines in the code on your own.