assembly syntax macros c-preprocessor gnu-assembler

I can't translate this chunk of GAS code to Intel/NASM syntax

In this code:

#define G(gi1, gi2, x, t0, t1, t2, t3) \
lookup_32bit(t0, t1, t2, t3, ##gi1, RGS1, shr_next, ##gi1);  \
lookup_32bit(t0, t1, t2, t3, ##gi2, RGS3, shr_next, ##gi2);  \
\
lookup_32bit(t0, t1, t2, t3, ##gi1, RGS2, dummy, none);      \
shlq $32,   RGS2;                                        \
orq     RGS1, RGS2;                                  \
lookup_32bit(t0, t1, t2, t3, ##gi2, RGS1, dummy, none);      \
shlq $32,   RGS1;                                        \
orq     RGS1, RGS3;

#define lookup_32bit(t0, t1, t2, t3, src, dst, interleave_op, il_reg) \
movzbl      src ## bl,        RID1d;     \
movzbl      src ## bh,        RID2d;     \
shrq $16,   src;                         \
movl        t0(CTX, RID1, 4), dst ## d;  \
movl        t1(CTX, RID2, 4), RID2d;     \
movzbl      src ## bl,        RID1d;     \
xorl        RID2d,            dst ## d;  \
movzbl      src ## bh,        RID2d;     \
interleave_op(il_reg);               \
xorl        t2(CTX, RID1, 4), dst ## d;  \
xorl        t3(CTX, RID2, 4), dst ## d;

"gi1" becomes RDX at the beginning, but I can't work out how it is used in the "movzbl" instruction. Basically, I can't figure out what the first operand is in movzbl ???, RID1d. I am a NASM user.

full code here: https://github.com/torvalds/linux/blob/master/arch/x86/crypto/twofish-avx-x86_64-asm_64.S

Solution

I'm still slightly confused by the uses of ## in G. I found a section of the GNU cpp manual which mentions ## after a comma, but it's meant for use in variadic macros, and this isn't one of those.

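But I'm going ahead with an explanation anyway, based on the assumption that those ## do not change the token that gets passed along. (A likely purpose: operands of ## are not macro-expanded before substitution, so ##gi1 may simply keep RGI1 from being expanded too early, before lookup_32bit has pasted a suffix onto it.)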

The ## in lookup_32bit, on the other hand, are perfectly normal and necessary.

Let's go up a level from the G macro and see how it's called. One of its calls looks like this:

G(RGI1, RGI2, x1, s0, s1, s2, s3)

Its first argument, RGI1, becomes gi1 in the expansion. The first piece of the G macro:

lookup_32bit(t0, t1, t2, t3, ##gi1, RGS1, shr_next, ##gi1)

expands lookup_32bit with ##gi1 as the 5th and 8th arguments. I'm assuming ##gi1 works the same as gi1, so the 5th and 8th arguments will be RGI1.

Inside the lookup_32bit macro, the 5th and 8th arguments are called src and il_reg, so both of those will expand to RGI1 in this instance. The first instruction in lookup_32bit:

movzbl      src ## bl,        RID1d;

pastes the src argument (RGI1) together with bl (which is not a macro or a macro argument, so it just represents itself), resulting in the pasted token RGI1bl. The instruction now looks like this:

movzbl      RGI1bl,        RID1d;

After the first pass of expanding lookup_32bit is done, the preprocessor will look again for macros to expand, and RGI1bl is a macro defined like this:

#define RGI1bl %dl

Also, RID1d is a macro defined like this:

#define RID1d %ebp

so the instruction ends up being:

movzbl      %dl,        %ebp;

and that's just a zero-extending move from the 8-bit register %dl to the 32-bit register %ebp.
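The paste-then-rescan sequence above can be reproduced with any C preprocessor. The following is only a sketch: STR, STR_, FIRST_INSN, expanded_first_insn and check_first_insn are names I made up for illustration, and the RGI1 definition is an assumption based on the linked file (only RGI1bl and RID1d are quoted above). It turns the expanded instruction into a string so the result can be inspected.

```c
#include <assert.h>
#include <string.h>

/* Register aliases. RGI1bl and RID1d are quoted in the answer above;
   RGI1 -> %rdx is an assumption based on the linked kernel file. */
#define RGI1   %rdx
#define RGI1bl %dl
#define RID1d  %ebp

/* Hypothetical helpers: two levels so the argument is fully expanded
   before it is stringified. Variadic because the expansion contains a comma. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

/* Hypothetical one-instruction cut-down of lookup_32bit. */
#define FIRST_INSN(src) movzbl src ## bl, RID1d

/* Returns the text of the instruction after preprocessing. */
const char *expanded_first_insn(void) {
    return STR(FIRST_INSN(RGI1));
}

/* Self-check: the paste produced RGI1bl, which was then rescanned into %dl. */
void check_first_insn(void) {
    const char *s = expanded_first_insn();
    assert(strstr(s, "dl") != NULL);    /* RGI1bl was expanded */
    assert(strstr(s, "ebp") != NULL);   /* RID1d was expanded */
    assert(strstr(s, "RGI1") == NULL);  /* no unexpanded macro name is left */
    assert(strstr(s, "rdx") == NULL);   /* RGI1 was pasted, not expanded on its own */
}
```

Note that RGI1 never turns into %rdx here even though it is a macro: an argument that is an operand of ## is substituted without being expanded first, so the paste sees RGI1 and bl.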

Looking at the other macros, you can see that there are a bunch of them starting with RGI1, all of which resolve to %rdx or portions of it. With these macros in place, selecting the low 8-bit portion of a 64-bit register can be done by pasting bl onto the end with ##, which wouldn't be possible using the native register names directly (there's no preprocessor operation as sophisticated as "remove the r from the front of this token and change the final x to an l").
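To illustrate how one macro argument selects different parts of the same register, here is a small sketch in the same spirit. The register family for RGI1 and the RID2d alias are assumptions taken from the linked file, and HIGH_BYTE_THEN_SHIFT, STR, STR_, expanded_excerpt and check_excerpt are hypothetical names used only for this example. The argument is pasted in the first instruction and used bare in the second.

```c
#include <assert.h>
#include <string.h>

/* Assumed register family for RGI1, mirroring the linked kernel file. */
#define RGI1   %rdx
#define RGI1bl %dl
#define RGI1bh %dh
#define RID2d  %esi

/* Hypothetical helpers: expand the argument fully, then stringify it. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

/* Hypothetical two-instruction excerpt modelled on lookup_32bit:
   src is pasted with bh in the first line and used as-is in the second. */
#define HIGH_BYTE_THEN_SHIFT(src) \
    movzbl src ## bh, RID2d;      \
    shrq $16, src;

/* Returns the text of both instructions after preprocessing. */
const char *expanded_excerpt(void) {
    return STR(HIGH_BYTE_THEN_SHIFT(RGI1));
}

/* Self-check: the pasted use selects %dh, the bare use selects %rdx. */
void check_excerpt(void) {
    const char *s = expanded_excerpt();
    assert(strstr(s, "dh") != NULL);    /* src ## bh -> RGI1bh -> %dh */
    assert(strstr(s, "esi") != NULL);   /* RID2d -> %esi */
    assert(strstr(s, "rdx") != NULL);   /* bare src -> RGI1 -> %rdx */
    assert(strstr(s, "RGI1") == NULL);  /* nothing left unexpanded */
}
```

For a NASM translation this means each use of src inside lookup_32bit has to be replaced by hand (or by your own macro set) with the matching part of the register: dl or dh where bl or bh is pasted on, and rdx where src appears alone.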

The specific names RGI1, RID1, etc. don't look familiar to me. My guess is that they are derived from the Twofish specification.

Token-pasting reference: http://gcc.gnu.org/onlinedocs/cpp/Concatenation.html#Concatenation