I'm trying to write a just-in-time compiler and I have a piece of code that just doesn't want to work. My platform is x86-64 Ubuntu.
I have the following code written in yasm:
bits 64
mov rdx, 1
mov rcx, 'A'
mov rbx, 1
mov rax, 4
int 0x80
ret
So if I understand correctly, this should write A to stdout. Now I compile this code with
yasm -f bin test.yasm
This resulted in the following machine code:
0x48 0xc7 0xc2 0x01 0x00 0x00 0x00 0x48 0xc7 0xc1 0x41 0x00
0x00 0x00 0x48 0xc7 0xc3 0x01 0x00 0x00 0x00 0x48 0xc7 0xc0
0x04 0x00 0x00 0x00 0xcd 0x80 0xc3
and then I read the resulting code in C++ and call it:
void *memory = allocate_executable_memory(sizeof(code));
emit_code_into_memory(sizeof(code), code, memory);
JittedFunc func = reinterpret_cast<JittedFunc>(memory);
func();
I think that the C++ part is fine since I've already tried it with simple arithmetic operations and it worked well.
There is no segmentation fault and the code seems to be executed, but nothing appears on stdout.
Any advice?
//EDIT: full C++ code:
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <string>
#include <sstream>
#include <iostream>
#include <iomanip>
#include <sys/mman.h>

void* allocate_executable_memory(size_t size) {
    void *ptr = mmap(
        0,
        size,
        PROT_READ | PROT_WRITE | PROT_EXEC,
        MAP_PRIVATE | MAP_ANONYMOUS,
        -1,
        0
    );
    if (ptr == MAP_FAILED) {
        perror("mmap");
        return nullptr;
    }
    return ptr;
}

void emit_code_into_memory(size_t code_length, uint8_t *code, void *memory) {
    memcpy(reinterpret_cast<uint8_t*>(memory), code, code_length);
}

typedef void (*JittedFunc)();

int main(int argc, char* argv[]) {
    /* Use like this:
       bin/jit 0xb8 0x11 0x00 0x00 0x00 0xc3
    */
    if (argc <= 1) {
        return 1;
    }
    // Variable-length array: a GNU extension, not standard C++.
    uint8_t code[argc-1];
    for (int i = 1; i < argc; i++) {
        code[i-1] = std::stoul(argv[i], nullptr, 16);
    }
    void *memory = allocate_executable_memory(sizeof(code));
    if (memory == nullptr) {
        return 1;
    }
    emit_code_into_memory(sizeof(code), code, memory);
    JittedFunc func = reinterpret_cast<JittedFunc>(memory);
    func();
    return 0;
}
The write syscall expects a pointer to the data to write, not an immediate value. Also, 64-bit code should use the syscall instruction, which has a different calling convention and different function numbers. int 0x80 still works in a 64-bit process, but it goes through the 32-bit interface, so any pointer passed that way is truncated to 32 bits. Your code therefore passes 'A' (0x41) as the buffer address, which is not a valid pointer. The strace output labels the call stat, because 4 is the number of stat in the 64-bit table:
stat(NULL, NULL) = -1 EFAULT (Bad address)
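The pointer requirement can be checked from C++ without any assembly. The following is only a sketch: write_one_byte is a hypothetical helper name, and it assumes Linux with the syscall() wrapper from <unistd.h>. Passing the address of a character succeeds, while passing the character code itself as the address fails with EFAULT, the same error shown in the strace output.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helper: issues the raw write system call for a single byte.
// Returns the number of bytes written, or -1 with errno set on failure.
long write_one_byte(int fd, const void *buf) {
    assert(fd >= 0);  // precondition: a valid-looking descriptor
    return syscall(SYS_write, fd, buf, std::size_t{1});
}
```

Calling write_one_byte(1, &c) with char c = 'A' prints the letter, whereas write_one_byte(1, reinterpret_cast<const void*>(0x41)) returns -1 and sets errno to EFAULT.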
You should try the following code instead:
push 'A'
mov rdi, 1 ; stdout
mov rsi, rsp ; buf
mov rdx, 1 ; count
mov rax, 1 ; sys_write
syscall
pop rdi ; cleanup
ret
This uses the stack to store the letter to print. The cleanup could use any caller-saved scratch register, or could be rewritten as add rsp, 8. The return value from the system call is in rax.
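Since the goal is a JIT compiler, it may help to see the 64-bit routine emitted as bytes from C++. This is a minimal sketch, not the assembler's actual output: the names Code, Reg, emit_mov_imm32 and emit_write_char are hypothetical, the encodings are hand-assembled, and yasm may pick different (shorter) encodings for the same instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Code = std::vector<uint8_t>;

// Register numbers as used in ModRM bytes (low eight registers only).
enum Reg : uint8_t { RAX = 0, RCX = 1, RDX = 2, RBX = 3, RSP = 4, RBP = 5, RSI = 6, RDI = 7 };

// mov r64, imm32 (sign-extended): REX.W C7 /0 followed by a little-endian imm32.
void emit_mov_imm32(Code &out, Reg dst, uint32_t imm) {
    assert(dst < 8);  // r8..r15 would need REX.B, which this sketch does not handle
    out.insert(out.end(), {0x48, 0xc7, static_cast<uint8_t>(0xc0 | dst)});
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<uint8_t>(imm >> shift));
    }
}

// Emits the routine that writes one character to stdout via the 64-bit write syscall.
Code emit_write_char(uint8_t c) {
    Code out;
    out.insert(out.end(), {0x6a, c});           // push imm8      ; character lands at [rsp]
    emit_mov_imm32(out, RDI, 1);                // mov rdi, 1     ; stdout
    out.insert(out.end(), {0x48, 0x89, 0xe6});  // mov rsi, rsp   ; buf
    emit_mov_imm32(out, RDX, 1);                // mov rdx, 1     ; count
    emit_mov_imm32(out, RAX, 1);                // mov rax, 1     ; sys_write
    out.insert(out.end(), {0x0f, 0x05});        // syscall
    out.insert(out.end(), {0x5f, 0xc3});        // pop rdi ; ret
    return out;
}
```

The result of emit_write_char('A') can be copied into executable memory with emit_code_into_memory, or passed to the harness from the question as hex arguments: 0x6a 0x41 0x48 0xc7 0xc7 0x01 0x00 0x00 0x00 0x48 0x89 0xe6 0x48 0xc7 0xc2 0x01 0x00 0x00 0x00 0x48 0xc7 0xc0 0x01 0x00 0x00 0x00 0x0f 0x05 0x5f 0xc3.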
The 32-bit version could look like:
push ebx ; callee-saved
push 'A'
mov ebx, 1 ; stdout
mov ecx, esp ; buf
mov edx, 1 ; count
mov eax, 4 ; sys_write
int 0x80
pop ecx ; cleanup buf (ecx is a scratch register, edi is callee-saved)
pop ebx ; restore ebx
ret
Notice that ebx has to be preserved according to the calling convention.