assembly, x86, gnu-assembler, att

What is the jmpl instruction in x86?


x86 assembly in AT&T syntax has operand-size suffixes on instructions, such as l (long), w (word), and b (byte).
So I assumed jmpl was just jmp with a long (32-bit) operand size.

But it behaved strangely when I assembled it:

Test1 jmp: assembly source, and disassembly

main:
  jmp main

eb fe     jmp 0x0804839b <main> 

Test2 jmpl: assembly source, and disassembly

main:
  jmpl main       # added l suffix

ff 25 9b 83 04 08   jmp *0x0804839b

Compared to Test1, the Test2 result is unexpected.
I expected it to assemble to the same bytes as Test1.


Question:
Is jmpl a different instruction in the x86 design?
(According to here, jmpl in SPARC means "jump and link". Is it something like that?)

...Or is this just a bug in the GNU assembler?


Solution

  • An l operand-size suffix on jmp implies an indirect jump, unlike calll main, which is still a relative near call. This inconsistency is pure insanity in the AT&T syntax design.

    (And since you're using it with an operand like main, it becomes a memory-indirect jump, doing a data load from main and using that as the new EIP value.)

    You never need to use the jmpl mnemonic; you can and should indicate indirect jumps using * on the operand, like jmp *%eax to set EIP = EAX, or jmp *4(%edi, %ecx, 4) to index a jump table, or jmp *func_pointer. The l suffix is optional in all of these.

    (You could use jmpw *%ax to truncate EIP to a 16-bit value. That assembles to 66 ff e0 jmpw *%ax.)
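
To see why ff 25 and ff e0 are both indirect jumps, it can help to decode the ModRM byte that follows the ff opcode: its middle three bits select the operation (/4 is near indirect jmp, /2 is near indirect call) and the remaining bits select the operand. The following is only a rough Python sketch under stated assumptions: decode_modrm is a hypothetical helper written for this illustration (it is not part of GAS or objdump), and it assumes 32-bit mode with 32-bit register names.

```python
# Hypothetical helper for illustration only; not part of GAS or objdump.
# Assumes 32-bit mode. (In 64-bit mode, mod=00 rm=101 means RIP-relative instead.)

# Operations selected by the reg field of ModRM when the opcode byte is 0xFF.
FF_GROUP = {
    0: "inc",
    1: "dec",
    2: "call (near, indirect)",
    3: "call (far, indirect)",
    4: "jmp (near, indirect)",
    5: "jmp (far, indirect)",
    6: "push",
}
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(modrm):
    """Return (operation, operand description) for the byte after an 0xFF opcode."""
    mod = modrm >> 6          # top 2 bits: addressing mode
    reg = (modrm >> 3) & 7    # middle 3 bits: opcode extension (/0 .. /7)
    rm = modrm & 7            # low 3 bits: register or memory base
    operation = FF_GROUP.get(reg, "(undefined)")
    if mod == 3:
        operand = "register %" + REGS32[rm]
    elif rm == 4:
        operand = "memory, SIB byte follows"
    elif mod == 0 and rm == 5:
        operand = "memory at absolute disp32"
    else:
        disp = {0: "no displacement", 1: "disp8", 2: "disp32"}[mod]
        operand = "memory at (%" + REGS32[rm] + "), " + disp
    return operation, operand


# 0x25: from "ff 25 ..." (jmp *main), 0xE0: from "ff e0" (jmp *%eax)
for byte in (0x25, 0xE0):
    print(hex(byte), decode_modrm(byte))
```

With a 66 operand-size prefix in front, the same ff e0 bytes name %ax instead of %eax, which is the jmpw *%ax case.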


    Compare "What is callq instruction?" and "What is the difference between retq and ret?": there, the operand-size suffix behaves the way you expected, the same as plain call or plain ret. But jmp is different.


    Semi-related: a far jmp or call (to a new CS:[ER]IP) in AT&T syntax is ljmp / lcall. These are very different.


    It's also insane that GAS accepts jmpl main as equivalent to jmpl *main. It only warns instead of erroring.

    $ gcc -no-pie -fno-pie -m32 jmp.s 
    jmp.s: Assembler messages:
    jmp.s:3: Warning: indirect jmp without `*'
    

    And then disassembling it to see what we got, with objdump -drwC a.out:

    08049156 <main>:                                          # corresponding source line (added by hand)
     8049156:       ff 25 56 91 04 08       jmp    *0x8049156    # jmpl main
     804915c:       ff 25 56 91 04 08       jmp    *0x8049156    # jmp  *main
     8049162:       ff 25 56 91 04 08       jmp    *0x8049156    # jmpl *main
    
    08049168 <foo>:
     8049168:       e8 fb ff ff ff          call   8049168 <foo> # calll foo
     804916d:       ff 15 68 91 04 08       call   *0x8049168    # calll *foo
     8049173:       ff 15 68 91 04 08       call   *0x8049168    # call  *foo
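
The difference between the direct and indirect encodings in this listing can be checked by hand: a direct jmp/call stores a signed displacement relative to the end of the instruction, while ff 25 / ff 15 store the absolute address of the memory location that holds the target. The following is a minimal Python sketch of that arithmetic, assuming 32-bit mode; rel_target and indirect_pointer_address are hypothetical helper names made up for this illustration.

```python
# Hypothetical helpers for illustration only; assumes 32-bit mode.

def rel_target(addr, insn):
    """Target of a direct relative jmp/call: end of instruction + signed displacement."""
    opcode = insn[0]
    if opcode == 0xEB:                # jmp rel8
        disp = int.from_bytes(insn[1:2], "little", signed=True)
    elif opcode in (0xE8, 0xE9):      # call rel32 / jmp rel32
        disp = int.from_bytes(insn[1:5], "little", signed=True)
    else:
        raise ValueError("not a direct relative jmp/call")
    return addr + len(insn) + disp


def indirect_pointer_address(insn):
    """Absolute address that ff 25 (jmp) or ff 15 (call) loads the new EIP from."""
    if insn[0] != 0xFF or insn[1] not in (0x25, 0x15):
        raise ValueError("not a memory-indirect jmp/call with an absolute disp32")
    return int.from_bytes(insn[2:6], "little")


# "calll foo" from the listing: relative, lands back on foo itself.
print(hex(rel_target(0x8049168, bytes.fromhex("e8fbffffff"))))

# "jmpl main" from the listing: loads EIP from the bytes stored at main.
print(hex(indirect_pointer_address(bytes.fromhex("ff2556910408"))))
```

Both results are the address of a label, but they mean different things: the first is where execution continues, the second is where the CPU reads 4 bytes of data to use as the new EIP.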
    

    We get the same thing if we replace l with q in the source and build without -m32 (using the default -m64), including the same warning about a missing *. But the disassembly has an explicit jmpq and callq on every instruction (except for a relative direct jmp I added, which uses the jmp mnemonic in the disassembly).

    It's like objdump thinks 32-bit is the default operand-size for jmp/call in both 32 and 64-bit mode, so it wants to always use a q suffix in 64-bit, but leaves it implicit in 32-bit mode. Anyway, that's just disassembly choice between implicit / explicit size suffixes, no weirdness for a programmer writing source code.


    Other AT&T-syntax assemblers:

    • Clang's built-in assembler does reject jmpl main, requiring jmpl *main.

      $ clang -m32 jmp.s
      jmp.s:3:8: error: invalid operand for instruction
        jmpl main
             ^~~~
      

      calll main is the same as call main. call *main and calll *main are both accepted for indirect calls.

    • YASM's GAS-syntax mode assembles jmpl main to a near relative jmp, like jmp main! So it disagrees with gcc/clang about jmpl implying indirect. (Very few people use YASM in GAS mode; and these days its maintenance hasn't kept up with NASM for new instructions like AVX512. I like YASM's good defaults for long NOPs, but otherwise I'd recommend NASM.)