Tags: linux, assembly, x86-64, nasm, compiler-development

x86_64 Warning : character constant too long [-w+other] (nasm)


I am writing a compiler in an attempt to switch my programming language from interpreted to compiled.

This is the code my script generated:

section .bss
  digitSpace resb 100
  digitSpacePos resb 8
    string_at_index_0 resb 12
    string_at_index_0_len resb 4

section .data

section .text
    global _start
_start:
   mov rax, "Hello world"
   mov [string_at_index_0], rax
   mov byte [string_at_index_0_len], 13

   mov rax, 1
   mov rdi, 1
   mov rsi, string_at_index_0
   mov rdx, string_at_index_0_len
   syscall

   mov rax, 60
   mov rdi, 0
   syscall

When I run this code with nasm -f elf64 -o test.o test.asm, I get this warning:

warning:character constant too long [-w+other]

Can anyone help me with this? If anyone could also suggest a better way to output a Hello world, that would be helpful too!


Solution

  • mov rax, "Hello world"
    

    RAX is a 64-bit (8-byte) register, and you are trying to put 11 bytes into it. NASM keeps only the first 8 bytes ("Hello wo") and emits this warning.

    If you want to store immediate data to memory, you can use mov rax, imm64 to put 8 bytes into RAX and then push it or store it. Alternatively, a short constant such as "hi!" fits in a 32-bit immediate, so you can push it directly if you want.
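
As a minimal sketch of the run-time store described above (not a recommendation), the 11-byte string can be split into an 8-byte chunk that goes through RAX and a 4-byte chunk stored as a 32-bit immediate. The buffer name buf and its 16-byte size are assumptions made for this illustration, and the backquoted constant relies on NASM's C-style escape syntax.

```nasm
section .bss
    buf resb 16                     ; hypothetical scratch buffer, large enough for 12 bytes

section .text
    global _start
_start:
    mov   rax, "Hello wo"           ; first 8 bytes: exactly fills a 64-bit immediate
    mov   [buf], rax                ; store bytes 0..7
    mov   dword [buf + 8], `rld\n`  ; bytes 8..11: "rld" plus a newline, as a 32-bit immediate

    mov   eax, 1                    ; __NR_write
    mov   edi, 1                    ; stdout
    lea   rsi, [rel buf]            ; pointer to the bytes, not the bytes themselves
    mov   edx, 12                   ; length as a value: 11 characters + newline
    syscall

    mov   eax, 60                   ; __NR_exit
    xor   edi, edi
    syscall                         ; _exit(0)
```

This works, but it spends instructions at run time to build data that never changes, which is why a data section is usually the better home for a constant string.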

    You don't want to put the message itself into a register; you want to put a pointer to the message into rsi. Since the message is constant, you might as well start with it in a data section instead of an immediate, so you don't have to run instructions at run time to store it. Likewise, rdx must hold the length itself, not the address of a variable that contains it.

    Here is a simple hello world:

    section .data                 ; or .rodata
    
    msg: db "Hello World", 10     ; including a `\n` newline
    .len equ $ - msg              ; assemble-time constant
    
    ; equivalent to
    ; msg.len equ 12        ; because the distance between here and the start of msg is 12 bytes.
    
    section .text
        global _start
    _start:
    
       mov rax, 1       ; write call number, __NR_write from asm/unistd_64.h
       mov edi, 1       ; to stdout
       mov rsi, msg     ; pointer to message
       mov rdx, msg.len ; length of the message that we defined earlier
       syscall          ; write(1, "Hello World\n", 12)
    
       mov  eax, 60         ; __NR_exit
       xor  edi, edi
       syscall              ; _exit(0)
    

    Ideally, your compiler should place string literals in the .rodata section (read-only data) and pass pointers to them when using them in functions.
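
For a compiler back end, one possible shape for the emitted output is sketched below: each string literal gets a label in .rodata and an assemble-time length constant, and the code only passes pointers and lengths around. The naming scheme (string_at_index_N and string_at_index_N_len) is borrowed from the question's generated code, and the second literal is purely illustrative.

```nasm
section .rodata
string_at_index_0:      db "Hello world", 10
string_at_index_0_len   equ $ - string_at_index_0   ; 12, computed by the assembler
string_at_index_1:      db "Bye", 10                ; illustrative second literal
string_at_index_1_len   equ $ - string_at_index_1   ; 4

section .text
    global _start
_start:
    mov   eax, 1                        ; __NR_write
    mov   edi, 1                        ; stdout
    mov   rsi, string_at_index_0        ; pointer to the first literal
    mov   edx, string_at_index_0_len    ; length is an immediate, no memory load needed
    syscall

    mov   eax, 1                        ; rax held the return value, so reload the call number
    mov   edi, 1
    mov   rsi, string_at_index_1
    mov   edx, string_at_index_1_len
    syscall

    mov   eax, 60                       ; __NR_exit
    xor   edi, edi
    syscall
```

Because the lengths are equ constants, no .bss slots such as string_at_index_0_len resb 4 are needed, and nothing has to be initialized at run time.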

    See also How to load address of function or label into register: mov rsi, msg is the least efficient way to do that, despite being the most "obvious" and simple.
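
To make that comparison concrete, here is a sketch of the same hello world using a RIP-relative LEA, with the other two forms left as comments. The instruction sizes in the comments are the standard encodings for these forms; whether mov esi, msg is usable depends on how the program is linked, so treat that line as an assumption about a non-PIE static executable.

```nasm
section .rodata
msg:    db "Hello World", 10
.len    equ $ - msg

section .text
    global _start
_start:
    mov   eax, 1            ; __NR_write
    mov   edi, 1            ; stdout
    lea   rsi, [rel msg]    ; 7 bytes, RIP-relative, works in position-independent code
    ; mov esi, msg          ; 5 bytes, but only when the address fits in 32 bits (non-PIE)
    ; mov rsi, msg          ; 10 bytes (mov r64, imm64): correct, just the largest encoding
    mov   edx, msg.len
    syscall                 ; write(1, msg, 12)

    mov   eax, 60           ; __NR_exit
    xor   edi, edi
    syscall                 ; _exit(0)
```

Assuming the file is named test.asm as in the question, it can be built with nasm -f elf64 -o test.o test.asm followed by ld -o test test.o.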