gcc  compiler-construction  x86  cpu-architecture  instruction-set

Change instruction set in GCC


I want to test some architecture changes on an already existing architecture (x86) using simulators. However, to properly test them and run benchmarks, I might have to make some changes to the instruction set. Is there a way to add these changes to GCC or any other compiler?


Solution

  • Simple solution:

    One common approach is to add inline assembly, and encode the instruction bytes directly.

    For example:

    int main()
    {
        asm __volatile__ (".byte 0x90\n");
        return 0;
    }
    

    compiles (gcc -O3) into:

    00000000004005a0 <main>:
      4005a0:       90                      nop
      4005a1:       31 c0                   xor    %eax,%eax
      4005a3:       c3                      retq
    

    So just replace 0x90 with your instruction's bytes. Of course, you won't see the actual instruction in a regular objdump, and the program will likely not run on real hardware (unless you reuse one of the NOP encodings), but the simulator should recognize it if it's properly implemented there.

    Note that you can't expect the compiler to optimize well for you when it doesn't know this instruction. If the instruction changes state (registers, memory), take care to declare that through the inline assembly input/output/clobber options to ensure correctness. Use optimizations only if you must.
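
    To make the input/output/clobber point concrete, here is a minimal sketch. It assumes a hypothetical instruction that reads and writes EAX; the wrapper name my_insn and the choice of register are illustration only and not part of any real ISA extension. The bytes used are a real 3-byte NOP (0f 1f c0) so the code still runs on ordinary x86 hardware; you would swap in your own encoding for the simulator.

```c
#include <stdint.h>

/*
 * Hypothetical wrapper around a custom instruction emitted as raw bytes.
 * Assumption: the instruction takes its operand in EAX and leaves its
 * result in EAX. The bytes below are a placeholder: 0f 1f c0 is a
 * multi-byte NOP, so on real hardware the value passes through unchanged.
 */
static inline uint32_t my_insn(uint32_t x)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__(
        ".byte 0x0f, 0x1f, 0xc0"  /* replace with your instruction bytes */
        : "+a"(x)                 /* EAX is both input and output */
        :                         /* no additional inputs */
        : "memory", "cc");        /* conservatively assume memory and flags change */
#endif
    return x;
}
```

    The "+a" constraint tells GCC that the value must be in EAX before the bytes execute and that EAX holds the result afterwards, so the compiler keeps nothing else live in that register. The "memory" and "cc" clobbers are deliberately conservative; drop them only if you are sure your instruction touches neither memory nor the flags.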


  • Complicated solution:

    The alternative approach is to implement this in your compiler. It can be done in GCC, but as stated in the comments, LLVM is probably one of the best compilers to play with, as it's designed as a compiler development platform. It's still very complicated, though: LLVM is best suited for work on the IR optimization stages, and is somewhat less friendly when you try to modify the target-specific backends.
    Still, it's doable, and you have to do it if you also plan to have your compiler decide when to issue this instruction. I'd suggest starting with the first option, though, to see if your simulator even works with this addition, and only then spending time on the compiler side.

    If and when you do decide to implement this in LLVM, your best bet is to define it as an intrinsic function. There's relatively more documentation about this here: http://llvm.org/docs/ExtendingLLVM.html