
Porting from Windows to Linux. Assembler command translation


I have recently started learning how to port code from Windows to Linux. I've been translating a program from Intel syntax to AT&T syntax and also converting it from 32-bit x86 to x86-64. Since I'm fairly new to assembly, and especially to AT&T syntax, I've run into some trouble while porting. Note that I'm intentionally not using the .intel_syntax directive.

I got stuck translating these lines:

RTLWriteIntegerBuffer: TIMES 3 DB 0x90,0x8D,0x40,0x00

followed by:

LEA EDI,[OFFSET RTLWriteIntegerBuffer+ECX-1]

Another one:

LEA EBX,[EDX+'0']

One more:

ReadCharInited: DB 0
CMP BYTE PTR ReadCharInited,0

Another question: is there a 1:1 mapping between AT&T syntax and Intel syntax? Or are there specific Intel instructions that are not supported in AT&T syntax?

And maybe someone knows about functions like this:

HEAP_NO_SERIALIZE=1
HEAP_GENERATE_EXCEPTIONS=4
HEAP_ZERO_MEMORY=8
...
INVOKE HeapAlloc,EAX,HEAP_GENERATE_EXCEPTIONS+HEAP_ZERO_MEMORY+HEAP_CREATE_ALIGN_16,4194332

This one is probably a Borland Turbo Assembler-specific way to call kernel32.dll's HeapAlloc, but I'm not sure. Can it be translated to the fallocate syscall?

Thanks in advance


Solution

  • When talking about "AT&T syntax" versus "Intel syntax", it normally only refers to the difference between instruction mnemonics and operand ordering and format.

    So, for example, this is an instruction in AT&T syntax:

    movl $1, (%esi)
    

    and this is the same instruction using Intel syntax:

    mov  DWORD PTR [esi], 1
    

    For every instruction representable in Intel syntax, there's an equivalent representation in AT&T syntax for that instruction.
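
    As an illustration, the snippets from the question could be written for GAS in AT&T syntax roughly as follows. This is a sketch that assumes x86-64 Linux, position-independent (RIP-relative) addressing, and that RCX and RDX hold the zero-extended values the original code kept in ECX and EDX. It is a fragment meant to be assembled with `as --64`, not a complete program.

    ```asm
        .data
    RTLWriteIntegerBuffer:                  # was: TIMES 3 DB 0x90,0x8D,0x40,0x00
        .rept 3
        .byte 0x90, 0x8D, 0x40, 0x00
        .endr
    ReadCharInited:                         # was: ReadCharInited: DB 0
        .byte 0

        .text
        # was: LEA EDI,[OFFSET RTLWriteIntegerBuffer+ECX-1]
        # Assumption: RCX is the zero-extended index from ECX.
        leaq RTLWriteIntegerBuffer(%rip), %rdi  # rdi = address of the buffer
        leaq -1(%rdi,%rcx), %rdi                # rdi = buffer + rcx - 1

        # was: LEA EBX,[EDX+'0']
        leal 0x30(%rdx), %ebx                   # 0x30 is the ASCII code of '0'

        # was: CMP BYTE PTR ReadCharInited,0
        cmpb $0, ReadCharInited(%rip)           # AT&T order: source first, then destination
    ```

    In a non-PIE executable the first LEA can also be written as a single instruction, `leaq RTLWriteIntegerBuffer-1(%rcx), %rdi`, but that relies on a 32-bit absolute relocation, which the linker rejects when building position-independent code.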

    Since there's no AT&T assembler and no Intel assembler any more, the directives (everything other than the instructions) are a different matter. The GNU assembler (GAS) supports AT&T and Intel syntax, but only its own directives, which are an extension of the directives used by the AT&T assembler. Microsoft's MASM supports only Intel syntax but also only its own directives, which are an extension of the original Intel assembler's. There isn't always a direct equivalent from one assembler's directives to another assembler's. In some cases the fact that they use different object file formats may prevent finding any way of implementing the functionality of a directive in a different assembler using a different object file format. (Or even the same assembler using a different format, as can be the case with the GNU assembler.)

    As an example, here are some GAS directives:

    .rept 3
    .byte 0x90, 0x8D, 0x40, 0x00
    .endr
    

    And here are the equivalent MASM directives:

    REPT 3
    DB 90h, 8Dh, 40h, 00h
    ENDM
    

    But there's no MASM equivalent of the following GAS directive, because it's specific to the ELF object format, which MASM doesn't support:

    .protected foo
    

    On the other hand, there's no direct GAS equivalent of the following MASM directive, because GAS doesn't support any complex high-level-language directives:

    INVOKE HeapAlloc,EAX,HEAP_GENERATE_EXCEPTIONS+HEAP_ZERO_MEMORY+HEAP_CREATE_ALIGN_16,4194332
    

    To port the former ELF-specific directive you'd have to redesign the application to deal with how Windows handles shared libraries. To port the latter MASM-specific directive you'd either have to create your own macro that does the work of figuring out how to pass all the arguments correctly, or just manually write out all the assembly instructions necessary for this call according to the Linux x86-64 ABI. (You'd also have to find an appropriate Linux function to call and pass a different set of arguments, but that's a separate issue from translating the directive itself.)
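
    As a rough sketch of the "write it out manually" route, the program below requests the same 4194332 bytes with an anonymous mmap system call, following the x86-64 Linux syscall convention. Choosing mmap is an assumption on my part, not something the original code dictates: anonymous mappings come back zero-filled and page-aligned, which roughly covers HEAP_ZERO_MEMORY and the 16-byte alignment, while a libc function such as calloc would be another option. fallocate is not a match, since it reserves space in a file rather than allocating memory. The numeric constants are the x86-64 Linux values.

    ```asm
        .globl _start
        .text
    _start:
        movl $9, %eax               # SYS_mmap on x86-64 Linux
        xorl %edi, %edi             # addr   = NULL, let the kernel choose
        movl $4194332, %esi         # length = size from the INVOKE line
        movl $3, %edx               # prot   = PROT_READ | PROT_WRITE
        movl $0x22, %r10d           # flags  = MAP_PRIVATE | MAP_ANONYMOUS
        movq $-1, %r8               # fd     = -1 (no backing file)
        xorl %r9d, %r9d             # offset = 0
        syscall                     # rax = mapped address, or -errno on failure

        cmpq $-4095, %rax           # error codes occupy the range [-4095, -1]
        jae .Lfail                  # unsigned compare catches that range

        movb $1, (%rax)             # touch the block to show it is writable

        movl $60, %eax              # SYS_exit
        xorl %edi, %edi             # exit status 0
        syscall
    .Lfail:
        movl $60, %eax              # SYS_exit
        movl $1, %edi               # exit status 1 stands in for HEAP_GENERATE_EXCEPTIONS
        syscall
    ```

    It should build with `as --64 alloc.s -o alloc.o && ld alloc.o -o alloc` (the file name is arbitrary). Note that the fourth syscall argument goes in R10 rather than RCX, because the syscall instruction overwrites RCX and R11.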

    Some assemblers try to be compatible with other assemblers; for example Borland's TASM tries to be compatible with MASM, although it's an older version of MASM. So what works in TASM (in its default MASM mode) will usually work in MASM and vice versa. Many assemblers, however, use essentially their own version of x86 assembly language.

    For example, the code you've shown in your post seems to mix two different assembly language dialects and can't be assembled by any single assembler. Your first line of code uses the TIMES directive, which is a NASM directive (also accepted by NASM-compatible assemblers such as YASM) and is not supported by MASM or GAS. NASM uses neither AT&T syntax nor Intel syntax; it has its own instruction syntax, although it's not that different from Intel syntax. It also has its own incompatible set of directives, not based on anything in particular, like the TIMES directive you showed.

    The rest of your code appears to be in MASM syntax. Except for the third line, it wouldn't assemble correctly with NASM (nor would the first line assemble correctly with MASM). I'm not sure if it would assemble with TASM either, since the INVOKE directive was added in MASM 6.

    Note that, given the nature of your code, it probably gains nothing by being written in assembly language and you might be far better off translating it into C, C++, or some other language you're more familiar with.