
Why does accessing a value held by a 16-bit register result in a Seg Fault, while doing the same operation on a 32-bit register works fine?


I am having issues with using 16-bit registers for a university assignment. The code I am trying to run is as follows:

myArray db 1,2,3,4; declared in the correct section
;-----------------
mov si, myArray
mov ax, [si] ; segfault here
sub ax, [si+1]
mov [si+3], ax

and

mov dx, 1234h
push dx
mov dx, 0ABCDh
push dx
mov bp, sp
mov ax, ss:[bp]

Both result in a segmentation fault. However, when I run the same code using the ESI and RBP registers, it works without issues.

I am running it on a VM with Ubuntu 22.04.3. A friend told me (I am not sure how reliable this is) that 64-bit x86 might be blocking direct access to some 16-bit registers, and that this is why I'm having issues. I brought it up with my professor, but they dismissed it, saying it was probably an error in my syntax.

I have tried searching for a solution but was unsuccessful. I originally was using WSL and, thinking it was the issue, switched to a VirtualBox VM, but to no avail.


Solution

  • mov eax, 1
    mov ebx, 0
    int 0x80 
    

    From a comment it has been established that you are writing a 32-bit program.
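    It is fine to use 16-bit registers to hold 16-bit values, but not to hold addresses! Your mov si, myArray and mov bp, sp load an incomplete address: only its low 16 bits. In 32-bit code, [si] and [bp] then address memory somewhere in the first 64 KB, which is normally not mapped into a Linux process, hence the segmentation fault. You get the full address with mov esi, myArray and mov ebp, esp.
    For performance reasons, when you load a 16-bit value it is better to write the whole 32-bit register rather than just its 16-bit part. Use movzx (MOVe with Zero eXtension) for this.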
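
    To see the truncation without crashing, here is a minimal sketch. It assumes NASM and a static 32-bit Linux build (for example nasm -f elf32 demo.asm followed by ld -m elf_i386 -o demo demo.o; the file name demo.asm is made up for illustration). It compares the full address with the part that would fit in SI and returns the result as the exit status:

    ```nasm
    ; Sketch only: assumes NASM, 32-bit Linux, int 0x80 system calls.
    section .data
    myArray db 1, 2, 3, 4

    section .text
    global _start
    _start:
        mov   esi, myArray      ; full 32-bit address of the array
        movzx eax, si           ; the part "mov si, myArray" would keep: low 16 bits
        xor   ebx, ebx          ; EBX = 0
        cmp   eax, esi          ; does the truncated address equal the real one?
        setne bl                ; EBX = 1 if truncation lost address bits
        mov   eax, 1            ; sys_exit, status in EBX
        int   0x80
    ```

    With a typical link the data section sits well above 0FFFFh, so the exit status (echo $? in the shell) would be expected to be 1, meaning SI alone cannot describe where myArray lives.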

    Watch out for buffer overflow

    Both mov ax, [si] and sub ax, [si+1] read two byte-sized array elements at once, and mov [si+3], ax also writes to the byte that comes after the array! That byte possibly belongs to another variable, which gets corrupted this way. It is not clear what this first snippet is supposed to achieve, but a rewrite would look like:

    mov   esi, myArray
    movzx eax, word [esi]  ; -> AX = 0201h  (EAX = 00000201h)
    sub   ax, [esi+1]      ; -> AX = FEFFh  (EAX = 0000FEFFh)
    mov   ???
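
    Purely as an illustration, and assuming (this is a guess, not something stated in the question) that the goal was to subtract the second element from the first and store the result in the fourth, a byte-sized version that never touches memory outside the array might look like this. NASM and 32-bit Linux are assumed again:

    ```nasm
    ; Hypothetical intent: myArray[3] = myArray[0] - myArray[1], all byte-sized.
    section .data
    myArray db 1, 2, 3, 4

    section .text
    global _start
    _start:
        mov   esi, myArray
        mov   al, [esi]          ; AL = 01h (first element)
        sub   al, [esi+1]        ; AL = 01h - 02h = 0FFh (wraps around)
        mov   [esi+3], al        ; overwrite the last element, nothing beyond it
        movzx ebx, byte [esi+3]  ; EBX = 000000FFh, used as exit status
        mov   eax, 1             ; sys_exit
        int   0x80
    ```

    Using AL instead of AX keeps every access one byte wide, so the store to [esi+3] stays inside the four-byte array.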
    

    Watch out for redundancies

    In mov ax, ss:[bp], using BP as the base register already makes the CPU use the SS segment register; it is the default. There is no need to explicitly add the segment override ss:, which, depending on the assembler, could consume an extra byte.
    It gets even better since you don't need to use EBP at all. You can address the stack via the ESP register.
    The equivalent code for your second snippet becomes:

    push  1234ABCDh        ; one dword push replaces the two word pushes
    movzx eax, word [esp]  ; -> AX = 0ABCDh  (EAX = 0000ABCDh)
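
    To check that the single dword push leaves the same two words on the stack as the original pair of word pushes, a small sketch follows. It assumes NASM assembling for 32-bit Linux, where push with an immediate stores a dword:

    ```nasm
    ; Sketch only: assumes NASM, 32-bit Linux, int 0x80 system calls.
    section .text
    global _start
    _start:
        push  1234ABCDh          ; stored little-endian: CDh ABh 34h 12h
        movzx eax, word [esp]    ; EAX = 0000ABCDh (the word pushed last originally)
        movzx ecx, word [esp+2]  ; ECX = 00001234h (the word pushed first originally)
        add   esp, 4             ; drop the dword again
        movzx ebx, al            ; EBX = 0CDh, low byte of the first word
        mov   eax, 1             ; sys_exit, status in EBX
        int   0x80
    ```

    The exit status is 0CDh (205 decimal), the low byte of the word at [esp].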