How to get MASM assembly working in a Linux .so file


I have somewhat of a unique problem. I have a .dll written using VC++ and MASM. (ml64 14.34.31942)

The C part is simply a wrapper so that one function can be called by a C# (.NET 6) application. There is only one entry point, which takes a pointer to a large state object. The function returns an int32 as a result code.

I'd like to get this application working on Linux. I think the C# application will work, but my problem is the calls to the unmanaged .dll.

If I take the object file that ml64.exe creates, I can wrap it in a C wrapper to set the registers to the calling convention:

core.c

#include <stdint.h>

extern int32_t asm_func(void* state);

int wrapper(void *state)
{
  __asm__ volatile(
     "mov %rdi, %rsi \t\n"
     "call asm_func  \t\n");

  return 0;
}

int32_t fnEmulatorCode(void* state) {
  return wrapper(state);
}

(return code tbd!)

and compile using: gcc -shared -o EmulatorCore.so -fPIC core.c core.obj -Wall -g -m64 (core.obj is the MASM object file)
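
As a possible alternative to the hand-written register shuffle, GCC's `ms_abi` function attribute can be put on the declaration so the compiler does the calling-convention conversion itself and the return value is passed through. This is only a sketch, and it assumes the MASM routine follows the standard Microsoft x64 convention (first argument in rcx), which may not match what my assembly actually expects. The body of `asm_func` below is a hypothetical stand-in so the example compiles on its own; in the real build that definition would be replaced by an `extern` declaration carrying the same attribute, with the symbol coming from core.obj.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the MASM routine. It is compiled with the
   Microsoft x64 convention, so it receives `state` in rcx, like code
   assembled by ml64 for Windows would. In the real build, use instead:
   extern __attribute__((ms_abi)) int32_t asm_func(void *state);       */
__attribute__((ms_abi)) int32_t asm_func(void *state)
{
    /* Assumed behaviour for the demo only: return the first int32 of
       the state object, or -1 when no state is supplied. */
    return state != NULL ? *(int32_t *)state : -1;
}

/* Exported entry point, called with the System V convention (rdi).
   Because asm_func is declared ms_abi, the compiler moves the argument
   into rcx, reserves the shadow space, and returns eax unchanged. */
int32_t fnEmulatorCode(void *state)
{
    return asm_func(state);
}
```

This only changes how the entry point is called. It does nothing about the function-pointer tables described further down.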

The resulting .so file is referenced like any other unmanaged dll:

    [DllImport(@"./EmulatorCore/EmulatorCore.so")]
    private static extern int fnEmulatorCode(ref CpuState state);

And the call works! Kind of. The entry point is found, and the C wrapper that changes the registers works fine.

However, within the assembler code there are tables of function pointers, which on Linux are not being relocated, so a jump through a table segfaults. This works on Windows.

Eg:

lea rax, [opcode_00]
lea rax, qword ptr [rax + rbx * 8]  ; rbx is the instruction
jmp qword ptr [rax]                 ; was originally jmp qword ptr [rax + rbx * 8]

...

opcode_00   qword   x00_brk         ; $00
opcode_01   qword   x01_ora_indx    ; $01
opcode_02   qword   noinstruction   ; $02
...

All other jumps and references seem to be fine.

Any suggestions on how to proceed? The object file generated by MASM is pe-x86-64 according to objdump. However, objcopy doesn't seem to be able to convert it to elf64-x86-64.

So why would Linux not 'relocate' these tables? Is it through some gcc optimisation? Or maybe dotnet isn't loading the library correctly? Or just that the pe-coff file doesn't expose them properly? If so, how do I work around?

I can't change the original code from MASM, but is there something that can convert it to NASM or similar so I can cross compile? (Not that Googling finds anything..)

I could probably change the table to a proper 'jump table', but that would hurt performance and there could be other issues, so in the first instance I'd rather try to keep the ASM source as is...

Edit:

Running nm -gC EmulatorCore.so | grep opcode_00 returns nothing, which suggests the table symbols aren't exported from the .so file, so perhaps they cannot be relocated properly? (The procs referenced by the table do appear, which is why the rest of the app appears to work.)

Edit2:

To illustrate the issue, compare these two images. The one from Windows shows table entries within the address range where the library is loaded in memory. On Linux the values are all very low, indicating they are offsets from the start of the library rather than absolute addresses.

Table in memory on Windows (RIP: 0x00007FFE043C5817): [image]

Table in memory on Linux (RIP: 0x00007f24fc59a5a8): [image: Linux memory dump]

RIP after the jump through the table, showing that the values are being read correctly; it's just that the values themselves are wrong: [image]


Solution

  • On Windows the jump tables are relocated properly at load time; on Linux they are not.

    To get this working, the table entries had to be converted to offsets relative to a label, with the label's address added back at dispatch time:

    lea rax, [start]
    add rax, qword ptr [rax + rbx * 8]  ; rbx is the instruction; rax = start + offset
    jmp rax                             ; was originally jmp qword ptr [rax + rbx * 8]
    
    ...
    
    start:
    opcode_00   qword   x00_brk         - start ; $00
    opcode_01   qword   x01_ora_indx    - start ; $01
    opcode_02   qword   noinstruction   - start ; $02
    ...
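
The same idea can be sketched in C to show the arithmetic. Each table entry stores the handler's distance from a base address, so no absolute address sits in the table and nothing needs relocating at load time. The handler names, `table_base` (standing in for the `start` label) and `init_table` are hypothetical. Unlike the MASM version, where the assembler computes the differences at assembly time, this sketch fills the table at run time, and it relies on function-pointer/integer casts that GCC accepts on x86-64 but ISO C does not guarantee.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*handler_fn)(void);

/* Hypothetical opcode handlers standing in for x00_brk and friends. */
static int op_brk(void)  { return 0x00; }
static int op_ora(void)  { return 0x01; }
static int op_none(void) { return -1; }

/* Hypothetical anchor playing the role of the `start` label. */
static int table_base(void) { return 0; }

/* Each entry holds (handler address - base address), not an address. */
static uintptr_t rel_table[3];

static void init_table(void)
{
    const handler_fn handlers[3] = { op_brk, op_ora, op_none };
    for (size_t i = 0; i < 3; i++)
        rel_table[i] = (uintptr_t)handlers[i] - (uintptr_t)table_base;
}

/* Dispatch: base + stored offset gives the handler address, which is
   then called directly (the equivalent of `add rax, [...]` / `jmp rax`). */
static int dispatch(size_t opcode)
{
    uintptr_t target = (uintptr_t)table_base + rel_table[opcode];
    return ((handler_fn)target)();
}
```

After calling `init_table()` once, `dispatch(1)` runs `op_ora`. The cost compared with an absolute table is one extra add per dispatch.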