I am writing a compiler in an attempt to switch my programming language from interpreted to compiled.
This is the code my script generated:
section .bss
digitSpace resb 100
digitSpacePos resb 8
string_at_index_0 resb 12
string_at_index_0_len resb 4
section .data
section .text
global _start
_start:
mov rax, "Hello world"
mov [string_at_index_0], rax
mov byte [string_at_index_0_len], 13
mov rax, 1
mov rdi, 1
mov rsi, string_at_index_0
mov rdx, string_at_index_0_len
syscall
mov rax, 60
mov rdi, 0
syscall
When I assemble this code with nasm -f elf64 -o test.o test.asm
I get this warning:
warning: character constant too long [-w+other]
Can anyone help me with this? Also, if anyone could suggest a better way to output a Hello world, that would be helpful too!
mov rax, "Hello world"
RAX is a 64-bit (8-byte) register, but you are trying to put 11 bytes into it, so NASM truncates the constant and warns.
If you want to store immediate data to memory, you can use mov rax, imm64
to put 8 bytes into RAX and then push it or store it. Or you can push "hi!"
as a 32-bit immediate if you want (push sign-extends it to 64 bits).
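As a minimal sketch of the store-from-immediates approach described above: the label buf and the 8 + 4 byte split are assumptions made for illustration, not something your generated code has to use. It builds "Hello world\n" in a .bss buffer at run time and then writes it.

```nasm
section .bss
buf:    resb 16                     ; hypothetical scratch buffer (12 bytes used)

section .text
global _start
_start:
    lea   rsi, [rel buf]            ; rsi = pointer to the buffer
    mov   rax, "Hello wo"           ; first 8 bytes fit in one imm64
    mov   [rsi], rax                ; store them at buf[0..7]
    mov   dword [rsi + 8], `rld\n`  ; remaining 4 bytes as an imm32 (backquotes enable \n)
    mov   eax, 1                    ; __NR_write
    mov   edi, 1                    ; fd 1 = stdout
    mov   edx, 12                   ; length in bytes, passed as a value, not an address
    syscall                         ; write(1, buf, 12)
    mov   eax, 60                   ; __NR_exit
    xor   edi, edi
    syscall                         ; _exit(0)
```

It should assemble with nasm -f elf64 and link with ld. Every string costs extra store instructions this way, so it is mostly useful for very short strings.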
You don't want to put the message itself into a register; you want to put a pointer to the message into rsi. Since the message is constant, you might as well start with it in a data section instead of an immediate, so you don't have to run instructions at run-time to store it.
Here is a simple hello world:
section .data ; or .rodata
msg: db "Hello World", 10 ; including a `\n` newline
.len equ $ - msg ; assemble-time constant
; equivalent to
; msg.len equ 12 ; because the distance between here and the start of msg is 12 bytes.
section .text
global _start
_start:
mov rax, 1 ; write call number, __NR_write from asm/unistd_64.h
mov edi, 1 ; to stdout
mov rsi, msg ; pointer to message
mov rdx, msg.len ; length of the message that we defined earlier
syscall ; write(1, "Hello World\n", 12)
mov eax, 60 ; __NR_exit
xor edi, edi
syscall ; _exit(0)
Ideally, your compiler should place string literals in the .rodata
section (read-only data) and pass pointers to them when using them in functions.
See also How to load address of function or label into register: mov rsi, msg
is the least efficient way to do that (NASM encodes it as a 10-byte mov with a 64-bit immediate), despite being the most obvious and simple.
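A sketch of how those two suggestions might look together, assuming a Linux x86-64 target: the string lives in .rodata and its address is loaded with a RIP-relative lea instead of mov rsi, msg. The names msg and msg.len are carried over from the example above; everything else is the same write/exit sequence.

```nasm
section .rodata
msg:    db "Hello World", 10        ; read-only string with trailing newline
.len    equ $ - msg                 ; msg.len = 12, computed at assemble time

section .text
global _start
_start:
    mov   eax, 1                    ; __NR_write (writing eax zero-extends into rax)
    mov   edi, 1                    ; fd 1 = stdout
    lea   rsi, [rel msg]            ; RIP-relative address of msg, 7-byte encoding
    mov   edx, msg.len              ; length as an immediate value
    syscall                         ; write(1, msg, 12)
    mov   eax, 60                   ; __NR_exit
    xor   edi, edi
    syscall                         ; _exit(0)
```

The lea form is position-independent, so it also works if the output is later linked as a PIE executable. In a non-PIE static executable, mov esi, msg is shorter still, but it depends on the address fitting in 32 bits.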