
What components of a machine affect the machine code produced given a C++ file input?


I wrote this question, "What affects generated machine code at each step of the compilation process?", and realized that it was much too broad. So I will try to ask about each component of it in a separate question.

The first question I will ask is: given an arbitrary C++ file, what affects the resulting executable binary file it produces? So far I understand that each of the following plays a role:

  • The CPU architecture, like x86_64, ARM64, PowerPC, MicroBlaze, etc.
  • The kernel of a machine, like Linux kernel v5.18, v5.17, a Windows kernel version, a Mac kernel version, etc.
  • The operating system, such as Debian, CentOS, Windows 7, Windows 10, Mac OS X Mountain Lion, Mac OS X Sierra.
    • Not sure what the OS changes on top of the kernel changes.
  • Finally, the tools used to compile, assemble and link. Things like GCC, Clang, Visual Studio (VS), GNU assembler, GNU compiler, VS compiler, VS linker, etc.

So the two questions I have from this are:

  • Is there some other component I left out that affects what the final executable looks like?
  • And does the operating system play a role in affecting the final executable machine code? Because I thought it would all be due to the kernel.

Solution

  • The main one I think you're missing is the Application Binary Interface.  Part of the ABI is the calling convention, which determines certain properties of register usage and parameter passing, so these directly affect the generated machine code.

    The kernel has a loader, and that loader works with file formats like ELF or PE. These influence the machine code by determining the layout of the process, how the program's code and data are loaded into memory, and how the machine code instructions access data and other code. Some environments want position-independent code, for example, which affects some of the machine code instructions.
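
To make the calling-convention and position-independence points a bit more concrete, here is a minimal sketch. The names `add` and `code_model` are made up for illustration. The `__PIC__` and `__PIE__` macros are GCC/Clang conventions, so other compilers may not define them, and the registers named in the comments are those of the System V AMD64 and Microsoft x64 conventions.

```cpp
#include <cassert>  // assert, handy for quick checks of the functions below
#include <string>

// The source is identical everywhere, but the instructions generated for a
// call to add() depend on the calling convention of the target ABI:
//   - System V AMD64 (x86-64 Linux, macOS): a arrives in EDI, b in ESI
//   - Microsoft x64 (64-bit Windows):       a arrives in ECX, b in EDX
// In both conventions the result is returned in EAX.
int add(int a, int b) { return a + b; }

// Reports how this translation unit was compiled, based on macros that
// GCC and Clang predefine under -fPIC / -fPIE (assumption: other compilers
// may not provide them, in which case the last branch is taken).
std::string code_model() {
#if defined(__PIE__)
    return "position-independent executable (PIE)";
#elif defined(__PIC__)
    return "position-independent code (PIC)";
#else
    return "position-dependent code, or macro not provided by this compiler";
#endif
}
```

Compiling the same file with and without `-fPIC` (GCC or Clang) and comparing the disassembly from `objdump -d` is one way to see how the addressing of functions and globals changes.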


    "The CPU architecture like x86_64, ARM64, PowerPC, MicroBlaze, etc."

    Yes.  The instruction set architecture defines the available instructions, which in turn determine the available CPU registers and how they can be used, as well as the sizes of things like pointers.
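
As a rough illustration of how the target architecture shows up in a build, the sketch below reports the ISA and the pointer width the compiler is targeting. The function names `target_isa` and `pointer_bits` are hypothetical, and the macro list is only a best-effort selection of GCC/Clang/MSVC spellings, not an exhaustive one.

```cpp
#include <cassert>  // assert, handy for quick checks of the functions below
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t
#include <string>

// Best-effort ISA name taken from compiler-predefined macros.
// Anything not listed here falls through to "unknown".
std::string target_isa() {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM64";
#elif defined(__powerpc64__)
    return "PowerPC64";
#elif defined(__powerpc__)
    return "PowerPC";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86";
#else
    return "unknown";
#endif
}

// Width of a data pointer in bits for the target being compiled for,
// e.g. 64 on x86_64 and 32 on 32-bit x86.
std::size_t pointer_bits() { return sizeof(void*) * CHAR_BIT; }
```

Printing both values from `main` on two different machines, or from one machine with a cross-compiler, shows that they are fixed at compile time by the chosen target rather than by the machine the program later runs on.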

    "The kernel of a machine like Linux kernel v5.18, v5.17, a Windows Kernel version, a Mac kernel version, etc."

    Not really.  The operating system choice influences the ABI, which is very relevant, though.

    "The operating system such as Debian, CentOS, Windows 7, Windows 10, Mac OS X Mountain Lion, Mac OS X Sierra."

    The operating system usually dictates the ABI, which is important.
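
One visible consequence of the OS choosing the ABI is the C data model: 64-bit Linux and macOS use LP64 (64-bit `long`), while 64-bit Windows uses LLP64 (32-bit `long`), even on the same x86_64 CPU. The sketch below, with the hypothetical name `data_model`, classifies the current target and assumes 8-bit bytes.

```cpp
#include <cassert>  // assert, handy for quick checks of the function below
#include <cstddef>  // std::size_t
#include <string>

// Names the data model from the sizes of int, long and a data pointer.
// Assumption: bytes are 8 bits wide, so a size of 8 means 64 bits.
std::string data_model() {
    constexpr std::size_t i = sizeof(int);
    constexpr std::size_t l = sizeof(long);
    constexpr std::size_t p = sizeof(void*);
    if (i == 4 && l == 8 && p == 8) return "LP64";   // e.g. 64-bit Linux, macOS
    if (i == 4 && l == 4 && p == 8) return "LLP64";  // e.g. 64-bit Windows
    if (i == 4 && l == 4 && p == 4) return "ILP32";  // typical 32-bit targets
    return "other";
}
```

Because `sizeof(long)` differs between these models, the same source line using `long` can compile to 32-bit or 64-bit arithmetic depending on the target OS.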

    "The tools used to compile, assemble and link. Things like GCC, Clang, Visual Studio (VS), GNU assembler, GNU compiler, VS Compiler, VS linker, etc."

    Of course. Different tools produce somewhat different machine code. Sometimes the results are functionally equivalent, though some tools produce better machine code than others for some inputs.
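
If you want to record which toolchain produced a given binary, a small sketch like the following can help. The function names `compiler_name` and `built_with_optimization` are made up for illustration, and `__OPTIMIZE__` is a GCC/Clang macro, so on other compilers the second function simply reports `false`.

```cpp
#include <cassert>  // assert, handy for quick checks of the functions below
#include <string>

// Identifies the compiler from its predefined macros. Clang is tested first
// because it also defines __GNUC__ for compatibility.
std::string compiler_name() {
#if defined(__clang__)
    return std::string("Clang ") + __clang_version__;
#elif defined(__GNUC__)
    return std::string("GCC ") + __VERSION__;
#elif defined(_MSC_VER)
    return "MSVC " + std::to_string(_MSC_VER);
#else
    return "unknown compiler";
#endif
}

// True when GCC or Clang was invoked with an optimization level above -O0.
// Assumption: compilers that do not define __OPTIMIZE__ always report false.
bool built_with_optimization() {
#if defined(__OPTIMIZE__)
    return true;
#else
    return false;
#endif
}
```

Building the same file with two compilers, or with `-O0` and `-O2`, and comparing the disassembly is a quick way to see how much the tool and its flags change the output.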