Tags: macos, assembly, x86-64, nasm, itoa

How to print signed integer in x86 assembly (NASM) on Mac


I found an implementation of unsigned integer conversion in x86 assembly and tried plugging it in, but being new to assembly and not having a debugging environment set up yet, I find it difficult to understand why it's not working. I would also like it to work with signed integers so it can print the error values returned by syscalls.

Could someone show how to fix this code so that it prints a signed integer, without using printf but using the strprn provided by this answer?

%define a rdi
%define b rsi
%define c rdx
%define d r10
%define e r8
%define f r9
%define i rax

%define EXIT 0x2000001
%define EXIT_STATUS 0

%define READ 0x2000003 ; read
%define WRITE 0x2000004 ; write
%define OPEN 0x2000005 ; open(path, oflag)
%define CLOSE 0x2000006 ; CLOSE
%define MMAP 0x2000197 ; mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t offset)

; strsz computes the length of a string.
; rdi - string address
; rdx - contains string length (returned)
strsz:
  xor     rcx, rcx                ; zero rcx
  not     rcx                     ; set rcx = -1 (uses bitwise id: ~x = -x-1)
  xor     al,al                   ; zero the al register (initialize to NUL)
  cld                             ; clear the direction flag
  repnz scasb                     ; get the string length (dec rcx through NUL)
  not     rcx                     ; rev all bits of negative -> absolute value
  dec     rcx                     ; -1 to skip the null-term, rcx contains length
  mov     rdx, rcx                ; size returned in rdx, ready to call write
  ret

; strprn writes a string to the file descriptor.
; rdi - string address
; rdx - contains string length
strprn:
  push    rdi                     ; push string address onto stack
  call    strsz                   ; call strsz to get length
  pop     rsi                     ; pop string to rsi (source index)
  mov     rax, WRITE              ; put the write syscall number in rax
  mov     rdi, 1        ; set rdi to file descriptor 1 (stdout)
  syscall                         ; call kernel
  ret

; mov ebx, 0xCCCCCCCD
itoa:
  xor rdi, rdi
  call itoal
  ret

; itoa loop
itoal:
  mov ecx, eax                    ; save original number

  mul ebx                         ; divide by 10 using agner fog's 'magic number'
  shr edx, 3                      ;

  mov eax, edx                    ; store quotient for next loop

  lea edx, [edx*4 + edx]          ; multiply by 10
  shl rdi, 8                      ; make room for byte
  lea edx, [edx*2 - '0']          ; finish *10 and convert to ascii
  sub ecx, edx                    ; subtract from original number to get remainder

  lea rdi, [rdi + rcx]            ; store next byte

  test eax, eax
  jnz itoal

exit:
  mov a, EXIT_STATUS ; exit status
  mov i, EXIT ; exit
  syscall

_main:
  mov rdi, msg
  call strprn
  mov ebx, -0xCCCCCCCD
  call itoa
  call strprn
  jmp exit

section .text
msg: db  0xa, "  Hello StackOverflow!!!", 0xa, 0xa, 0

With this working, it will be possible to print signed integers to STDOUT, so you can log register values.


Solution

  • My answer on "How do I print an integer in Assembly Level Programming without printf from the c library?", which you already linked, shows that serializing an integer into memory as ASCII decimal gives you a length, so you have no use for (a custom version of) strlen here.

    (Your msg has an assemble-time constant length, so it's silly not to use that, e.g. with msglen equ $ - msg right after the string.)

    To print a signed integer, implement this logic:

    if (x < 0) {
        print('-');   // or just was_negative = 1
        x = -x;
    }
    unsigned_intprint(x);
    

    Treating the result as unsigned covers the abs(most_negative_integer) case. For example, in 8-bit, -(-128) overflows back to -128 as a signed value, but the same bit pattern (0x80) is 128 as unsigned. So if you treat the result of that conditional neg as unsigned, it's correct with no overflow for all inputs.

    Instead of actually printing a - by itself, just save the fact that the starting number was negative and stick the - in front of the other digits after generating the last one. For bases that aren't powers of 2, the normal algorithm can only generate digits in reverse order of printing, so the - is naturally the last character you store.
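The sign handling and reverse-order digit generation described above can be sketched in C. This is a minimal illustration, not the code from the linked answer: `format_i64` and its parameters are hypothetical names, and it assumes the caller supplies a buffer with at least 20 bytes before `buf_end` (a '-' plus up to 19 digits for a 64-bit value). The negation is done on the unsigned type, so the most-negative input needs no special case, and the length falls out of pointer arithmetic with no strlen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the decimal text of x so that it ENDS at
 * buf_end (no NUL terminator is stored). Returns a pointer to the first
 * character and stores the byte count in *len.
 * Assumption: at least 20 writable bytes precede buf_end. */
static char *format_i64(int64_t x, char *buf_end, size_t *len)
{
    assert(buf_end != NULL && len != NULL);

    char *p = buf_end;
    int was_negative = (x < 0);

    /* Conditional negation in unsigned arithmetic: this wraps modulo 2^64,
     * so INT64_MIN becomes 2^63 with no signed overflow. */
    uint64_t u = was_negative ? 0 - (uint64_t)x : (uint64_t)x;

    /* Digits come out least-significant first, so store them backwards. */
    do {
        *--p = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);

    /* The sign is stored last, which puts it in front of the digits. */
    if (was_negative)
        *--p = '-';

    *len = (size_t)(buf_end - p);
    return p;
}
```

A caller would hand the returned pointer and length straight to write, for example `p = format_i64(x, buf + sizeof buf, &n);` followed by `write(1, p, n);`.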

    My x86-64 print integer with syscall answer treats the input as unsigned, so you should simply use that with some sign-handling code around it. It was written for Linux, but replacing the write system call number (with the 0x2000004 that your WRITE macro already defines) will make it work on Mac. They have the same calling convention and ABI.


    And BTW, xor al,al is strictly worse than xor eax,eax unless you specifically want to preserve the upper 7 bytes of RAX. Only xor-zeroing of full registers is handled efficiently as a zeroing idiom.

    Also, repnz scasb is not fast; about 1 compare per clock for large strings.

    For strings up to 16 bytes, you can use a single XMM vector with pcmpeqb / pmovmskb / bsf to find the first zero byte, with no loop. (SSE2 is baseline for x86-64).
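The single-vector search for the first zero byte can be sketched in C with SSE2 intrinsics, which map onto the instructions named above. This is only an illustration under stated assumptions: `strlen16` is a hypothetical name, the string is assumed to live in a full 16-byte buffer (the load always reads all 16 bytes), and `__builtin_ctz` is a GCC/Clang builtin standing in for bsf. A scalar fallback is included so the sketch still compiles on targets without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: length of a NUL-terminated string stored in a
 * 16-byte buffer. Returns 16 if none of the 16 bytes is zero. */
static size_t strlen16(const char *buf)
{
    assert(buf != NULL);
#if defined(__SSE2__)
    __m128i v  = _mm_loadu_si128((const __m128i *)buf);      /* movdqu   */
    __m128i eq = _mm_cmpeq_epi8(v, _mm_setzero_si128());     /* pcmpeqb  */
    unsigned mask = (unsigned)_mm_movemask_epi8(eq);         /* pmovmskb */
    if (mask == 0)
        return 16;                       /* no zero byte in this vector  */
    return (size_t)__builtin_ctz(mask);  /* index of lowest set bit (bsf) */
#else
    /* Scalar fallback for non-SSE2 targets: same result, one byte at a time. */
    size_t i = 0;
    while (i < 16 && buf[i] != 0)
        i++;
    return i;
#endif
}
```

The mask has one bit per byte, so the position of its lowest set bit is the index of the first NUL, which is the string length.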