assembly x86 nasm machine-code

Is it possible to call a relative address with each instruction at most 3 bytes long, in 32-bit mode?


I'm working on an exercise in x86 assembly (using NASM) that has the niche requirement of limiting each instruction to a maximum of 3 bytes.

I'd like to call a label, but the normal way to do this (shown in the code example) always results in an instruction size of 5 bytes. I'm trying to find out if there's a series of instructions, 3 bytes or less each, that can accomplish this.

I've attempted to load the label address into a register and then call that register, but it seems like the address is then interpreted as an absolute address, instead of a relative one.

I looked around to see if there's a way to force call to interpret the address in the register as a relative address, but couldn't find anything. I have thought about simulating a call by pushing a return address to the stack and using jmp rel8, but am unsure how to get the absolute address of where I want to return to.

Here is the normal way to do what I want:

[BITS 32]

call func     ; this results in a 5-byte call rel32 instruction
; series of instructions here that I would like to return to

func:
  ; some operations here
  ret 

I have tried things like this:

[BITS 32]

mov eax, func          ; 5-byte  mov r32, imm32
call eax               ; 2-byte  call r32
          ; this fails, seems to interpret func's relative address as an absolute
 ...   ; series of instructions here that I would like to return to

func:
  ; some operations here
  ret 

I have a feeling there may be a way to do this using some sort of LEA magic, but I'm relatively new to assembly so I couldn't figure it out.

Any tips are appreciated!


Solution

  • There is no such thing as a relative indirect near CALL: the register or memory operand of an indirect CALL is always treated as an absolute address. You will have to find some other mechanism to call the label func. One method I can think of is building the absolute address in a register and doing an absolute indirect call through that register.


    It is unclear what the target of your code is, so this assumes you are generating a 32-bit Linux program. I use a linker script to compute the individual bytes of the target label's address. The program uses those bytes to build the call target (the address of func) in EAX and then performs an indirect near call via EAX. Two ways of building the address are presented.

    A linker script link.ld that breaks a label's address into individual bytes:

    SECTIONS
    {
      . = 0x8048000;
      func_b0 =  func & 0x000000ff;
      func_b1 = (func & 0x0000ff00) >> 8;
      func_b2 = (func & 0x00ff0000) >> 16;
      func_b3 = (func & 0xff000000) >> 24;
    }
    

    Assembly code file myprog.asm:

    [BITS 32]
    global _start                  ; entry point, so the linker can find it
    global func                    ; exported so link.ld can reference it
    extern func_b0, func_b1, func_b2, func_b3
    
    _start:
        ; Method 1
        mov al, func_b3            ; EAX = ######b3
        mov ah, func_b2            ; EAX = ####b2b3
        bswap eax                  ; EAX = b3b2####
        mov ah, func_b1            ; EAX = b3b2b1##
        mov al, func_b0            ; EAX = b3b2b1b0
        call eax
    
        ; Method 2
        mov ah, func_b3            ; EAX = ####b3##
        mov al, func_b2            ; EAX = ####b3b2
        shl eax, 16                ; EAX = b3b20000
        mov ah, func_b1            ; EAX = b3b2b100
        mov al, func_b0            ; EAX = b3b2b1b0
        call eax
    
        ; series of instructions here that I would like to return to
        xor eax, eax
        mov ebx, eax               ; EBX = 0 return value
        inc eax                    ; EAX = 1 exit system call
        int 0x80                   ; Do exit system call
    
    func:
        ; some operations here
        ret
    

    Assemble and link with:

    nasm -f elf32 -F dwarf myprog.asm -o myprog.o
    gcc -m32 -nostartfiles -g -Tlink.ld myprog.o -o myprog
    

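    If you run objdump -Mintel -Dx myprog, the output of interest should look similar to the following. With func at 0x08048020, the linker script yields func_b0 = 0x20, func_b1 = 0x80, func_b2 = 0x04 and func_b3 = 0x08, and no instruction is longer than 3 bytes: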

    00000020 g       *ABS*  00000000 func_b0
    00000004 g       *ABS*  00000000 func_b2
    08048020 g       .text  00000000 func
    00000080 g       *ABS*  00000000 func_b1
    00000008 g       *ABS*  00000000 func_b3
    
    ...
    
    08048000 <_start>:
     8048000:       b0 08                   mov    al,0x8
     8048002:       b4 04                   mov    ah,0x4
     8048004:       0f c8                   bswap  eax
     8048006:       b4 80                   mov    ah,0x80
     8048008:       b0 20                   mov    al,0x20
     804800a:       ff d0                   call   eax
     804800c:       b4 08                   mov    ah,0x8
     804800e:       b0 04                   mov    al,0x4
     8048010:       c1 e0 10                shl    eax,0x10
     8048013:       b4 80                   mov    ah,0x80
     8048015:       b0 20                   mov    al,0x20
     8048017:       ff d0                   call   eax
     8048019:       31 c0                   xor    eax,eax
     804801b:       89 c3                   mov    ebx,eax
     804801d:       40                      inc    eax
     804801e:       cd 80                   int    0x80
    
    08048020 <func>:
     8048020:       c3                      ret
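
    If the target is instead a flat binary loaded at a known address, the same byte split might be done inside NASM without a linker script. The following is a minimal sketch under that assumption, not a tested drop-in: ORIGIN, FUNC_ABS, and the start label are names made up for illustration, and ORIGIN must match wherever the image is actually loaded. NASM refuses to apply & or >> to a relocatable label, so the sketch rewrites the address as ORIGIN + (func - $$), which is a plain number.

    ```nasm
    ; Hypothetical flat-binary variant of Method 2 above.
    ; ASSUMPTION: the image is loaded at ORIGIN and has a single section.
    [BITS 32]
    ORIGIN equ 0x8048000
    org ORIGIN

    ; (func - $$) is a scalar offset from the section start, so adding
    ; ORIGIN gives the absolute address as a number that & and >> accept.
    %define FUNC_ABS (ORIGIN + (func - $$))

    start:
        mov ah, (FUNC_ABS >> 24) & 0xff    ; 2 bytes, EAX = ####b3##
        mov al, (FUNC_ABS >> 16) & 0xff    ; 2 bytes, EAX = ####b3b2
        shl eax, 16                        ; 3 bytes, EAX = b3b20000
        mov ah, (FUNC_ABS >> 8) & 0xff     ; 2 bytes, EAX = b3b2b100
        mov al, FUNC_ABS & 0xff            ; 2 bytes, EAX = b3b2b1b0
        call eax                           ; 2 bytes, absolute indirect near call

        ; series of instructions here that I would like to return to
        jmp short $                        ; 2 bytes, placeholder loop so
                                           ; execution never falls into func

    func:
        ; some operations here
        ret
    ```

    This would be assembled with nasm -f bin. The jmp short $ is only a placeholder for the code that follows the call; replace it with whatever the exercise needs, keeping each instruction at 3 bytes or fewer.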