Where is __aeabi_fmul being linked from?

I've been running code on an ARM Cortex-M0+ core, and I see that the vast majority of my time is spent in floating point calculations. So I am experimenting with a custom floating point function for use in very low power applications.

I've been using ARM GCC for bare-metal compilation on an M0+ (which has no hardware FPU). I see that floating point multiplication is compiled to a call to __aeabi_fmul, which is then resolved at link time to produce the final ELF file.

My questions are as follows:

Where is __aeabi_fmul defined? Is it in a pre-compiled library that comes with GCC?
Is it possible to change this definition in some way? For example, could I provide a pre-compiled my_fp_mul and link to that instead of __aeabi_fmul?

I understand that the second part may require me to mess with the compiler. I've been looking into Clang/LLVM for this, since the general consensus seems to be that it's easier to modify than GCC. I'm just trying to see whether this is even possible or whether I'm barking up entirely the wrong tree.

Thank you.

Solution

It is part of libgcc, the GCC runtime library. Download the gcc sources and search for those functions and you will find them. They are soft-float routines and are hand tuned, so you are unlikely to do a significantly better job, but knock yourself out. I am not sure why you would do any floating point on an MCU like that, but thankfully the language and the tools allow it, although it can consume a lot of flash and execution time. (A possible compromise is to avoid float variables entirely and do the math yourself in fixed point.)
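To make the fixed-point compromise concrete, here is a minimal sketch of a Q16.16 multiply. The type and helper names (q16_t, q16_from_int, q16_mul) are hypothetical and chosen only for illustration. It assumes 16 fractional bits are enough for your data and that the compiler uses an arithmetic right shift for negative values, as gcc does. Note that the 64-bit intermediate product is itself likely to call a libgcc helper (__aeabi_lmul) on an M0+, so this is not automatically free of library code.

```c
#include <stdint.h>

/* Q16.16 fixed point: 16 integer bits, 16 fractional bits.
   All names in this block are hypothetical, for illustration only. */
typedef int32_t q16_t;

#define Q16_ONE ((q16_t)1 << 16)

/* Convert a small integer to Q16.16 (no overflow check). */
q16_t q16_from_int(int32_t n)
{
    return (q16_t)(n * Q16_ONE);
}

/* Multiply two Q16.16 values. The raw product carries 32 fractional
   bits, so it is shifted right by 16 to get back to Q16.16. The result
   is truncated, and overflow of the final value is not checked.
   Assumes >> on a negative int64_t is an arithmetic shift (true for gcc). */
q16_t q16_mul(q16_t a, q16_t b)
{
    return (q16_t)(((int64_t)a * b) >> 16);
}
```

For example, q16_mul(Q16_ONE / 2, q16_from_int(5)) represents 0.5 * 5 and yields the Q16.16 encoding of 2.5.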

If you use gcc to link, gcc knows where the libraries are and will pull them in automatically. If you use ld to link (using gcc just as a compiler, not as the caller of everything in the toolchain), ld does not know where to find the libraries, and you can simply add your own object on the command line. This is the simplest way.

You can take the GNU source for a particular function as-is, add it to your project, and then modify it, or just replace it completely with your own function.

Naturally you can go into the compiler sources, rename things, and rebuild the compiler; I am not sure how much work you want to do here. Replacing the floating point routines without mistakes is already a large task. As mentioned in the comments, I would leave the compiler alone and just work with it (leave the names the same and link with ld).
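As a rough idea of what "your own function" could look like for the multiply case, here is a deliberately simplified sketch in C. It is not the libgcc algorithm and not IEEE 754 compliant. The assumptions are: zero and denormal inputs are treated as zero, NaN and infinity inputs are not handled, and results are truncated rather than rounded. The name my_fp_mul comes from the question; on the target you would presumably name it __aeabi_fmul so the compiler-generated calls resolve to it. The 64-bit product is likely to pull in __aeabi_lmul on an M0+, so a real version would want to avoid it.

```c
#include <stdint.h>

/* View a float's bits through a union, so no memcpy is needed in a
   -nostdlib build. Union punning is permitted in C99 and later. */
typedef union { float f; uint32_t u; } f32_bits;

/* Simplified single-precision multiply (sketch only, see assumptions
   in the text: no NaN/infinity inputs, no denormals, truncation). */
float my_fp_mul(float a, float b)
{
    f32_bits x = { .f = a }, y = { .f = b }, r;
    uint32_t sign = (x.u ^ y.u) & 0x80000000u;
    int32_t ea = (int32_t)((x.u >> 23) & 0xFFu);
    int32_t eb = (int32_t)((y.u >> 23) & 0xFFu);

    if (ea == 0 || eb == 0) {           /* zero or denormal operand */
        r.u = sign;
        return r.f;
    }

    /* Restore the implicit leading 1 to get 24-bit significands. */
    uint32_t ma = (x.u & 0x007FFFFFu) | 0x00800000u;
    uint32_t mb = (y.u & 0x007FFFFFu) | 0x00800000u;

    /* 24 x 24 -> 48-bit product, in the range [2^46, 2^48). */
    uint64_t prod = (uint64_t)ma * mb;
    int32_t e = ea + eb - 127;          /* remove one copy of the bias */

    if (prod & (1ULL << 47)) {          /* significand product in [2, 4) */
        prod >>= 1;
        e += 1;
    }

    if (e <= 0)   { r.u = sign; return r.f; }                /* underflow: flush to zero */
    if (e >= 255) { r.u = sign | 0x7F800000u; return r.f; }  /* overflow: infinity */

    /* Keep the top 23 fraction bits (truncation), drop the implicit 1. */
    r.u = sign | ((uint32_t)e << 23) | ((uint32_t)(prod >> 23) & 0x007FFFFFu);
    return r.f;
}
```

Checking a sketch like this on a host machine against the native float multiply, before moving it to the MCU, is a cheap way to catch mistakes in the exponent and normalization handling.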

start.s

.thumb
.thumb_func
.global _start
_start:
stacktop: .word 0x20001000
.word reset
.word hang
.word hang
.word hang
.thumb_func
reset:
    bl notmain
.thumb_func
hang:   b .

so.c

float notmain ( float a, float b )
{
    return(a+b);
}

memmap

MEMORY
{
    rom : ORIGIN = 0x00000000, LENGTH = 0x1000
    ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
    .text : { *(.text*) } > ram
    .bss : { *(.bss*) } > rom
}

build

arm-none-eabi-as start.s -o start.o
arm-none-eabi-gcc -Xlinker -T -Xlinker memmap -nostdlib -nostartfiles -ffreestanding -mthumb start.o so.c -o so.elf -lgcc
arm-none-eabi-objdump -D so.elf

It doesn't complain, but it makes a perfectly broken binary:

20000048 <__addsf3>:
20000048:   e1b02080    lsls    r2, r0, #1
2000004c:   11b03081    lslsne  r3, r1, #1
20000050:   11320003    teqne   r2, r3
20000054:   11f0cc42    mvnsne  r12, r2, asr #24
20000058:   11f0cc43    mvnsne  r12, r3, asr #24
2000005c:   0a000047    beq 20000180 <__addsf3+0x138>
20000060:   e1a02c22    lsr r2, r2, #24
20000064:   e0723c23    rsbs    r3, r2, r3, lsr #24
20000068:   c0822003    addgt   r2, r2, r3
2000006c:   c0201001    eorgt   r1, r0, r1
20000070:   c0210000    eorgt   r0, r1, r0

Those are ARM instructions, not Thumb. Examining what the linker was passed:

0:[/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/../../../../arm-none-eabi/bin/ld]
1:[-plugin]
2:[/opt/gnuarm/libexec/gcc/arm-none-eabi/7.1.0/liblto_plugin.so]
3:[-plugin-opt=/opt/gnuarm/libexec/gcc/arm-none-eabi/7.1.0/lto-wrapper]
4:[-plugin-opt=-fresolution=/tmp/ccSyISCJ.res]
5:[-X]
6:[-o]
7:[so.elf]
8:[-L/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/thumb]
9:[-L/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0]
10:[-L/opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/../../../../arm-none-eabi/lib]
11:[-T]
12:[memmap]
13:[start.o]
14:[/tmp/ccrdRU2s.o]
15:[-lgcc]

The other approach is to link with ld directly and name the library explicitly:

arm-none-eabi-gcc -O2 -c -mthumb so.c -o so.o
arm-none-eabi-ld -T memmap start.o so.o /opt/gnuarm/lib/gcc/arm-none-eabi/7.1.0/thumb/libgcc.a  -o so.elf

But this is still broken:

20000038 <__addsf3>:
20000038:   e1b02080    lsls    r2, r0, #1
2000003c:   11b03081    lslsne  r3, r1, #1
20000040:   11320003    teqne   r2, r3
20000044:   11f0cc42    mvnsne  r12, r2, asr #24
20000048:   11f0cc43    mvnsne  r12, r3, asr #24
2000004c:   0a000047    beq 20000170 <__addsf3+0x138>
20000050:   e1a02c22    lsr r2, r2, #24
20000054:   e0723c23    rsbs    r3, r2, r3, lsr #24

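I have not yet done what is needed to get the right library (presumably a matter of passing -mcpu=cortex-m0plus or -march=armv6-m so that gcc selects a matching multilib, assuming the toolchain was built with one). I have to run and will re-edit this later...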

But my proposed solution is:

.thumb_func
.globl __aeabi_fadd
__aeabi_fadd:
    bx lr

I added this to start.s for demonstration purposes, then rebuilt:

arm-none-eabi-as start.s -o start.o
arm-none-eabi-ld -T memmap start.o so.o -o so.elf
arm-none-eabi-objdump -D so.elf

Disassembly of section .text:

20000000 <_start>:
20000000:   20001000    andcs   r1, r0, r0
20000004:   20000015    andcs   r0, r0, r5, lsl r0
20000008:   20000019    andcs   r0, r0, r9, lsl r0
2000000c:   20000019    andcs   r0, r0, r9, lsl r0
20000010:   20000019    andcs   r0, r0, r9, lsl r0

20000014 <reset>:
20000014:   f000 f802   bl  2000001c <notmain>

20000018 <hang>:
20000018:   e7fe        b.n 20000018 <hang>

2000001a <__aeabi_fadd>:
2000001a:   4770        bx  lr

2000001c <notmain>:
2000001c:   b510        push    {r4, lr}
2000001e:   f7ff fffc   bl  2000001a <__aeabi_fadd>
20000022:   bc10        pop {r4}
20000024:   bc02        pop {r1}
20000026:   4708        bx  r1

Then fill in whatever you want. Clearly this is not a real program: it breaks many rules, no numbers are actually passed in, and so on.

But the compiler generated a call to __aeabi_fadd, I supplied an __aeabi_fadd, and the linker was happy.

What I have done in the past, since I build my own GNU toolchain anyway, is to put a syntax error in the source file of interest and run the build. When it fails, the long command line used to build that item is on the screen. Then isolate the function of interest, use that long gcc command line as a guide, and tweak and tune as desired. You get there faster than by trying to figure out all the defines in the code on your own.
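The same hook can be written in C instead of assembly. The sketch below is a hypothetical C counterpart of the stub: like bx lr, it just returns its first argument unchanged. The fadd_calls counter is my own addition for illustration, as one way to confirm during bring-up that your definition, and not a library one, is being called. Note that the body must not use the float + operator, since on a soft-float target that would compile into a call to __aeabi_fadd itself.

```c
/* Illustrative call counter (not part of the original example). */
volatile unsigned int fadd_calls;

/* Stub with the name the compiler emits calls to. It does not add
   anything: it returns a unchanged, mirroring the assembly stub. */
float __aeabi_fadd(float a, float b)
{
    (void)b;        /* unused in this stub */
    fadd_calls++;
    return a;
}
```

Compile this into its own object and list it on the ld command line in the same way as start.o and so.o.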