myfunction:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mul r3, r0, r0
mov r0, r3
mla r0, r1, r0, r2
bx lr
I am able to generate everything except for the mov instruction using the following C function.
int myfunction(int r0, int r1, int r2, int r3)
{
    r3 = r0*r0;
    r0 = r3;
    r3 = r0;
    return (r1*r3)+r2;
}
How can I get the compiler to copy r3 into r0 (the mov r0, r3) in the generated assembly code?
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
return (a*a*b)+c;
}
Your choices are going to be something like this, without optimization
00000000 <myfunction>:
0: e52db004 push {r11} ; (str r11, [sp, #-4]!)
4: e28db000 add r11, sp, #0
8: e24dd014 sub sp, sp, #20
c: e50b0008 str r0, [r11, #-8]
10: e50b100c str r1, [r11, #-12]
14: e50b2010 str r2, [r11, #-16]
18: e51b3008 ldr r3, [r11, #-8]
1c: e51b2008 ldr r2, [r11, #-8]
20: e0010392 mul r1, r2, r3
24: e51b200c ldr r2, [r11, #-12]
28: e0000291 mul r0, r1, r2
2c: e51b3010 ldr r3, [r11, #-16]
30: e0803003 add r3, r0, r3
34: e1a00003 mov r0, r3
38: e28bd000 add sp, r11, #0
3c: e49db004 pop {r11} ; (ldr r11, [sp], #4)
40: e12fff1e bx lr
or this, with optimization enabled
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e0202391 mla r0, r1, r3, r2
8: e12fff1e bx lr
as you have probably figured out.
The compiler backend should never emit the mov, as it just wastes an instruction: r3 can go straight into the mla, so there is no need to copy it into r0 first. I'm not quite sure how to get the compiler to do more. Even this doesn't encourage it
unsigned int fun ( unsigned int a )
{
return(a*a);
}
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
return (fun(a)*b)+c;
}
giving
00000000 <fun>:
0: e1a03000 mov r3, r0
4: e0000093 mul r0, r3, r0
8: e12fff1e bx lr
0000000c <myfunction>:
c: e0030090 mul r3, r0, r0
10: e0202391 mla r0, r1, r3, r2
14: e12fff1e bx lr
Basically, if you don't optimize you get nowhere near what you were after. If you do optimize, that mov shouldn't be there; it should be easy to optimize out.
While you can sometimes shape high-level code to nudge the compiler toward particular low-level output, getting this exact output is not something you should expect to be able to do.
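To see why the mov is redundant, here is a minimal C sketch that models the two instruction sequences register by register. The function names with_mov and without_mov are made up for illustration, and uint32_t stands in for the 32-bit ARM registers. It only models the arithmetic; it does not control what the compiler emits.

```c
#include <stdint.h>

/* Hypothetical model of: mul r3,r0,r0 ; mov r0,r3 ; mla r0,r1,r0,r2 */
uint32_t with_mov(uint32_t r0, uint32_t r1, uint32_t r2)
{
    uint32_t r3;
    r3 = r0 * r0;       /* mul r3, r0, r0     */
    r0 = r3;            /* mov r0, r3         */
    r0 = r1 * r0 + r2;  /* mla r0, r1, r0, r2 */
    return r0;
}

/* Hypothetical model of: mul r3,r0,r0 ; mla r0,r1,r3,r2 */
uint32_t without_mov(uint32_t r0, uint32_t r1, uint32_t r2)
{
    uint32_t r3;
    r3 = r0 * r0;       /* mul r3, r0, r0     */
    r0 = r1 * r3 + r2;  /* mla r0, r1, r3, r2 */
    return r0;
}
```

Both functions return (a*a*b)+c modulo 2^32 for every input, because the mla reads the same value whether it comes from r0 or r3. That is the reason an optimizer is free to drop the copy.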
Unless you use inline asm
asm
(
"mul r3, r0, r0\n"
"mov r0, r3\n"
"mla r0, r1, r0, r2\n"
"bx lr\n"
);
giving your result
Disassembly of section .text:
00000000 <.text>:
0: e0030090 mul r3, r0, r0
4: e1a00003 mov r0, r3
8: e0202091 mla r0, r1, r0, r2
c: e12fff1e bx lr
or real asm
mul r3, r0, r0
mov r0, r3
mla r0, r1, r0, r2
bx lr
and feed it into gcc rather than as (arm-whatever-gcc -c so.s -o so.o)
Disassembly of section .text:
00000000 <.text>:
0: e0030090 mul r3, r0, r0
4: e1a00003 mov r0, r3
8: e0202091 mla r0, r1, r0, r2
c: e12fff1e bx lr
so that technically you are using gcc on the command line, but gcc is only acting as a driver there and feeds the file to as (a .S file would also be run through the C preprocessor first).
Unless you find a core or bug where Rd and Rs have to be the same register, and can then specify that core/bug/whatever on the gcc command line, I don't see the mov happening. Maybe, just maybe, with clang/llvm: compile fun and myfunction separately to bytecode, combine them, optimize, output to the target, and then examine that. I would hope that either the optimization or the output stage would remove the mov, but you might get lucky.
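The inline asm route can be sketched as a complete translation unit that C code can call. This is only a sketch under assumptions: the .global and .type directives and the myfunction label are additions that the snippet above did not have, and the preprocessor guard is my own, so that the file still builds (with a plain C fallback) when the target is not 32-bit ARM in ARM mode.

```c
/* Prototype so C callers can use the asm-defined symbol. */
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c);

#if defined(__arm__) && !defined(__thumb__)
/* File-scope (basic) asm: the compiler passes this text to the
   assembler unchanged, so the mov survives at any optimization level. */
__asm__(
    ".text\n"
    ".global myfunction\n"
    ".type myfunction, %function\n"
    "myfunction:\n"
    "    mul r3, r0, r0\n"
    "    mov r0, r3\n"
    "    mla r0, r1, r0, r2\n"
    "    bx lr\n"
);
#else
/* Assumed fallback for any other target: same arithmetic in C. */
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
    return (a * a * b) + c;
}
#endif
```

Because the asm body is opaque to the optimizer, this trades away inlining and register allocation for exact control over the instruction sequence.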
I made an error:
unsigned int myfunction(unsigned int a, unsigned int b, unsigned int c)
{
return (a*a*b)+c;
}
arm-linux-gnueabi-gcc --version
arm-linux-gnueabi-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Disassembly of section .text:
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e1a00003 mov r0, r3
8: e0202091 mla r0, r1, r0, r2
c: e12fff1e bx lr
but this
arm-none-eabi-gcc --version
arm-none-eabi-gcc (GCC) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
arm-none-eabi-gcc -O2 -c so.c -o so.o
arm-none-eabi-objdump -D so.o
so.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e0202391 mla r0, r1, r3, r2
8: e12fff1e bx lr
I'll have to build a 7.3 or go find one. Somewhere between 5.x.x and 8.x.x the backend changed or...
Note that you may need -mcpu=arm7tdmi or -mcpu=arm9tdmi or -march=armv4t or -march=armv5t on the command line, depending on the default target (cpu/arch) built into your compiler. Otherwise you might get Thumb-2 output, something like this
Disassembly of section .text:
00000000 <myfunction>:
0: fb00 f000 mul.w r0, r0, r0
4: fb01 2000 mla r0, r1, r0, r2
8: 4770 bx lr
a: bf00 nop
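One way to check what the compiler's default target is, sketched in C under the assumption that your gcc defines the usual predefined macros __arm__, __thumb__ and __ARM_ARCH (older toolchains may not define __ARM_ARCH). The helper names target_summary, target_arch and print_target are hypothetical.

```c
#include <stdio.h>

/* Reports which instruction set this translation unit was compiled for. */
const char *target_summary(void)
{
#if defined(__thumb__)
    return "thumb";
#elif defined(__arm__)
    return "arm";
#else
    return "not-arm32";
#endif
}

/* Returns the architecture version the compiler targets, or 0 if the
   macro is not defined by this compiler. */
int target_arch(void)
{
#if defined(__ARM_ARCH)
    return __ARM_ARCH;
#else
    return 0;
#endif
}

/* Convenience printer for use from a test program. */
void print_target(void)
{
    printf("mode=%s arch=%d\n", target_summary(), target_arch());
}
```

If this reports thumb, the compiler is not generating ARM-mode code by default, which would explain output like the mul.w listing. Running the compiler with -dM -E on an empty input (for example arm-none-eabi-gcc -dM -E - < /dev/null) lists the same macros without compiling anything.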
this
arm-none-eabi-gcc --version
arm-none-eabi-gcc (GCC) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
produces
Disassembly of section .text:
00000000 <myfunction>:
0: e0030090 mul r3, r0, r0
4: e0202391 mla r0, r1, r3, r2
8: e12fff1e bx lr
So you may have to work backward to find the version where it changed and the gcc source code change that caused it, then modify 7.3.0, making something that is not really 7.3.0 but reports as 7.3.0 and outputs your desired code.