I was trying to use AVX in a Mandelbrot program and it's not working right.
I tried debugging it, but GDB refuses to show me the floating-point values in the YMM registers. Here's a minimal example:
t.c
#include <stdio.h>
extern void loadnum(void);
extern double input[4];
extern double output[4];
int main(void)
{
/*
input[0] = 1.1;
input[1] = 2.2;
input[2] = 3.3;
input[3] = 3.14159;
*/
printf("%f %f %f %f\n",input[0],input[1],input[2],input[3]);
loadnum();
printf("%f %f %f %f\n",output[0],output[1],output[2],output[3]);
return 0;
}
l.asm
section .data
global input
global output
align 64
input dq 1.1,2.2,3.3,3.14159
output dq 0,0,0,0
section .text
global loadnum
loadnum:
vmovapd ymm0, [input]
vmovapd [output],ymm0
ret
How it's compiled:
OBJECTS = t.o l.o
CFLAGS = -c -O2 -g -no-pie -mavx -Wall
t: $(OBJECTS)
	gcc -g -no-pie $(OBJECTS) -o t
t.o: t.c
	gcc $(CFLAGS) t.c
l.o: l.asm
	nasm -felf64 -gdwarf l.asm
The output is
> 1.100000 2.200000 3.300000 3.141590
> 1.100000 2.200000 3.300000 3.141590
which shows it's loading and storing these doubles as expected, but in gdb it shows
> gdb t (followed by some boilerplate)
> Reading symbols from t...
> (gdb) b loadnum
> Breakpoint 1 at 0x4011b0: file l.asm, line 15.
> (gdb) run
> Starting program: /somedir/t
> 1.100000 2.200000 3.300000 3.141590
> Breakpoint 1, loadnum () at l.asm:15
> 15 vmovapd ymm0, [input]
> (gdb) n
> 16 vmovapd [output],ymm0
> (gdb)
then I say
> (gdb) info all-registers
and this shows up.
> ymm0 (blah blah) v4_double = {0x1, 0x2, 0x3, 0x3}
when I expected it to show
> ymm0 (blah blah) v4_double = {1.100000 2.200000 3.300000 3.141590}
None of the other fields show anything like that either, unless you want to parse the floating-point bits by hand:
> v4_int64 = {0x3ff199999999999a, 0x400199999999999a, 0x400a666666666666, 0x400921f9f01b866e}
How can I fix this?
Use p $ymm0.v4_double. The print command (p) defaults to decimal formatting, so this shows the elements as floating-point values. Use p /whatever for other formats, like p /x $ymm0.v4_int64 to see hex for the bit patterns. See help p for more.
display $ymm0.v4_double can work as a stand-in for layout reg + tui reg vec, which is buggy/broken in some versions and is always an unusable mess of different formats for registers as wide and numerous as ymm0-15. display takes the same format options as print, and prints before every prompt. (Use undisplay 1 to remove one expression, or undisplay with no argument to remove all of the expressions you've set up.)
It can get cluttered in TUI mode (layout asm, or layout reg + layout next to see integer regs and disassembly) if you want to track more than a couple of registers, so you might prefer non-TUI mode: either don't use layout in the first place, or run tui dis (short for tui disable).
(When debugging hand-written asm, I almost always want to look at disassembly, not source; but maybe for a complicated algorithm I'd sometimes want to see source with comments as a reminder of what the values should be/mean at a certain point.)