I have been working on what is essentially a while loop to go through all CLI arguments. While working on a solution that prints only one element I noticed a few things; this is the thought process that led me here.
I noticed that if I did lea 16(%rsp), %someRegisterToWrite, I was able to get/print argv[1]. Next I tried lea 24(%rsp), %someRTW and this gave me access to argv[2]. I kept going up to see if it would continue to work, and it did.
My thought was to keep adding 8 to %someRTW and increment a "counter" until the counter was equal to argc. The following code works great when a single argument is entered, but it prints nothing with 2 arguments, and when I enter 3 arguments it prints the first 2 with no space in between.
.section __DATA,__data
.section __TEXT,__text
.globl _main
_main:
    lea (%rsp), %rbx           #argc
    lea 16(%rsp), %rcx         #argv[1]
    mov $0x2, %r14             #counter
L1:
    mov (%rcx), %rsi           #%rsi = user_addr_t cbuf
    mov (%rcx), %r10
    mov 16(%rcx), %r11
    sub %r10, %r11             #Get number of bytes until next arg
    mov $0x2000004, %eax       #4 = write
    mov $1, %edi               #edi = file descriptor
    mov %r11, %rdx             #user_size_t nbyte
    syscall
    cmp (%rbx), %r14           #if counter < argc
    jb L2
    jge L3
L2:
    inc %r14
    mov 8(%rcx), %rcx          #mov 24(%rsp) back into %rcx
    mov $0x2000004, %eax
    mov $0x20, %rsi            #0x20 = space
    mov $2, %rdx
    syscall
    jmp L1
L3:
    xor %rax, %rax
    xor %edi, %edi
    mov $0x2000001, %eax
    syscall
I am going to assume that on 64-bit OS/X you are assembling and linking in such a way that you intentionally bypass the C runtime code. One example would be a static build without the C runtime startup files and the System library, where you specify that _main is your program entry point. _start is generally the process entry point unless overridden.
In this scenario the 64-bit kernel will load the macho64 program into memory and set up the process stack with the program arguments and environment variables, among other things. The Apple OS/X process stack state at startup is the same as what is documented in Section 3.4 of the System V x86-64 ABI: the argument count is at 0(%rsp), followed by the argument pointers starting at 8(%rsp) and then the environment pointers.
One observation is that the list of argument pointers is terminated with a NULL(0) address. You can use this to loop through all parameters until you find the NULL(0) address, as an alternative to relying on the value in argc.
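To make the NULL-terminated layout concrete, here is a minimal sketch in C rather than assembly, purely for readability. The helper name count_until_null is hypothetical and not part of the original code; it simply walks a pointer list until it reaches the NULL(0) entry, which is the same test the assembly loop can perform on each 8-byte slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): count the entries of a
 * NULL-terminated pointer list, the same shape as the argument pointer list
 * found on the startup stack. */
size_t count_until_null(char *const *vec)
{
    assert(vec != NULL);        /* the list itself must exist */

    size_t n = 0;
    while (vec[n] != NULL)      /* stop at the NULL(0) terminator */
        n++;
    return n;                   /* number of pointers before the terminator */
}
```

Called as count_until_null(argv + 1), this would skip the program name and count only the parameters, without ever reading argc.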
One problem is that your code assumes that registers are all preserved across a SYSCALL. The SYSCALL instruction itself will destroy the contents of RCX and R11:
SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contain a canonical address.)
SYSCALL also saves RFLAGS into R11 and then masks RFLAGS using the IA32_FMASK MSR (MSR address C0000084H); specifically, the processor clears in RFLAGS every bit corresponding to a bit that is set in the IA32_FMASK MSR
One way to avoid this is to try and use registers other than RCX and R11. Otherwise you will have to save/restore them across a SYSCALL if you need their values to be untouched. The kernel will also clobber RAX with a return value.
A list of the Apple OS/X system calls provides the details of all the available kernel functions. In 64-bit OS/X code each of the system call numbers has 0x2000000 added to it:
In 64-bit systems, Mach system calls are positive, but are prefixed with 0x2000000 — which clearly separates and disambiguates them from the POSIX calls, which are prefixed with 0x1000000
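As a small illustration of the numbering, the sketch below (in C for readability) builds the value that goes into EAX by adding 0x2000000 to a system call number. The names DARWIN64_SYSCALL_BASE and darwin64_syscall are made up for this example; only the 0x2000000 offset and the numbers 1 (exit) and 4 (write) come from the text and code in this answer.

```c
#include <assert.h>
#include <stdint.h>

/* Offset added to each system call number in 64-bit code, as described above.
 * The macro and function names are hypothetical. */
#define DARWIN64_SYSCALL_BASE 0x2000000u

uint32_t darwin64_syscall(uint32_t nr)
{
    assert(nr < DARWIN64_SYSCALL_BASE);   /* the number must stay below the prefix */
    return DARWIN64_SYSCALL_BASE + nr;    /* e.g. 4 (write) becomes 0x2000004 */
}
```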
Your method to compute the length of a command line argument will not work. The address of one argument doesn't necessarily have to be placed in memory after the previous one. The proper way is to write code that starts at the beginning of the argument you are interested in and searches for a NUL(0) terminating character.
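The search for the terminator can be sketched as follows, again in C for readability. The function name arg_length is hypothetical; the logic is an ordinary strlen-style scan and makes no assumption about where the next argument is stored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of one argument, found by scanning from its
 * first byte until the NUL(0) terminating character. */
size_t arg_length(const char *arg)
{
    assert(arg != NULL);

    size_t len = 0;
    while (arg[len] != '\0')    /* keep going until the NUL(0) byte */
        len++;
    return len;                 /* the terminator itself is not counted */
}
```

Because the scan only looks at the argument's own bytes, it gives the same result whether or not the arguments happen to be adjacent in memory.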
This code to print a space or separator character won't work:
    mov 8(%rcx), %rcx          #mov 24(%rsp) back into %rcx
    mov $0x2000004, %eax
    mov $0x20, %rsi            #0x20 = space
    mov $2, %rdx
    syscall
When using the sys_write system call, the RSI register is a pointer to a character buffer. You can't pass an immediate value like 0x20 (space). You need to put the space or some other separator (like a new line) into a buffer and pass that buffer through RSI.
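The same requirement is visible in the C prototype of write, which takes a pointer and a byte count. The sketch below is only an illustration; write_separator is a hypothetical helper name. It stores the space in memory and passes its address, with a length of 1 because the separator is a single byte.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: write one space to fd. The separator is stored in a
 * buffer and its address is passed, since write expects a pointer and not
 * the character value itself. */
ssize_t write_separator(int fd)
{
    static const char space[] = " ";            /* buffer holding the separator */

    assert(fd >= 0);
    return write(fd, space, sizeof space - 1);  /* address plus length of 1 byte */
}
```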
The following code applies the ideas above, along with some additional cleanup, and writes each of the command line parameters (excluding the program name) to standard output. Each one is followed by a newline. Newline on Darwin OS/X is 0x0a (\n).
# In 64-bit OSX syscall numbers = 0x2000000+(32-bit syscall #)
SYS_EXIT  = 0x2000001
SYS_WRITE = 0x2000004
STDOUT    = 1

.section __DATA, __const
newline: .ascii "\n"
newline_end: NEWLINE_LEN = newline_end-newline

.section __TEXT, __text
.globl _main
_main:
    mov (%rsp), %r8            # 0(%rsp) = # args. This code doesn't use it
                               #    Only save it to R8 as an example.
    lea 16(%rsp), %rbx         # 8(%rsp)=pointer to prog name
                               # 16(%rsp)=pointer to 1st parameter
.argloop:
    mov (%rbx), %rsi           # Get current cmd line parameter pointer
    test %rsi, %rsi
    jz .exit                   # If it's zero we are finished

    # Compute length of current cmd line parameter
    # Starting at the address in RSI (current parameter) search until
    # we find a NUL(0) terminating character.
    # rdx = length not including terminating NUL character

    xor %edx, %edx             # RDX = character index = 0
    mov %edx, %eax             # RAX = terminating character NUL(0) to look for
.strlenloop:
    inc %rdx                   # advance to next character index
    cmpb %al, -1(%rsi,%rdx)    # Is character at previous char index
                               #     a NUL(0) character?
    jne .strlenloop            # If it isn't a NUL(0) char then loop again
    dec %rdx                   # We don't want strlen to include NUL(0)

    # Display the cmd line argument
    # sys_write requires:
    #    rdi = output device number
    #    rsi = pointer to string (command line argument)
    #    rdx = length
    #
    mov $STDOUT, %edi
    mov $SYS_WRITE, %eax
    syscall

    # display a new line
    mov $NEWLINE_LEN, %edx
    lea newline(%rip), %rsi    # We use RIP addressing for the
                               #     string address
    mov $SYS_WRITE, %eax
    syscall

    add $8, %rbx               # Go to next cmd line argument pointer
                               #     In 64-bit pointers are 8 bytes
    # lea 8(%rbx), %rbx        # This LEA instruction can replace the
                               #     ADD since we don't care about the flags
                               #     rbx = 8 + rbx (flags unaltered)
    jmp .argloop

.exit:
    # Exit the program
    # sys_exit requires:
    #    rdi = return value
    #
    xor %edi, %edi
    mov $SYS_EXIT, %eax
    syscall
If you intend to use code like strlen in various places then I recommend creating a function that performs that operation. I have hard coded strlen into the code for simplicity. If you are looking to improve on the efficiency of your strlen implementation then a good place to start would be Agner Fog's Optimizing subroutines in assembly language.
This code should assemble and link to a static executable without the C runtime using:
gcc -e _main progargs.s -o progargs -nostartfiles -static