parsing compiler-construction x86-64 inline-assembly language-design

Inline assembly in compiler design


I am making my own compiler for my own C-like language (x86-64). But I am confused as to how one would compile a snippet of another type of language, namely x86-64 assembly such as:

int main() {
   __asm {
       mov rcx, rsp
       call func
   }
}

As soon as __asm is encountered, the lexer must somehow switch from C tokens to assembly tokens. What if, for instance, I have a variable outside of the __asm block named rcx? What is a good way to incorporate this in a C-like compiler design? How would you tokenize and parse it in a way that separates it from the C-like code? The __asm block would first be recognized at the parser level, but you can't reach that level without having tokenized it...


Solution

  • One option is to do what modern MSVC does and provide intrinsics for every instruction, including privileged ones like invlpg (because MSVC doesn't support inline asm for targets other than 32-bit x86). This is how MS is still able to use it to develop the Windows kernel.

    That won't work well if you're not keeping on top of future instruction-set extensions in all target ISAs you care about, though.


    I'd really recommend using GNU C's Extended inline asm syntax where operand constraints describe the asm template string to the compiler. The compiler itself doesn't have to understand it at all, just substitute strings into it like printf looking for %conversion. (See What is the difference between 'asm', '__asm' and '__asm__'?)

    The C var names being accessed are specified using a fixed syntax that doesn't depend on the asm syntax. Also, the asm is inside "" as a string literal at the C syntax level, so stuff like ARM push {r4, lr} isn't visible to the block-scope parsing. See https://stackoverflow.com/tags/inline-assembly/info for more docs / guides to exactly how GNU C inline asm works. Also note that its template / operand-constraint syntax is (almost?) the same as what GCC uses internally in its machine-definition files that teach the compiler about available instructions for different targets.
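    As a rough illustration of how little the compiler needs to know about the asm itself, here is a minimal sketch in C of the substitution step, assuming register allocation has already picked a register for each operand. The function name expand_asm_template and its signature are hypothetical, and real GCC also handles named operands and operand modifiers, which this ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expand %0..%9 in an asm template using the register
   names chosen for each operand. "%%" yields a literal '%'.
   Returns 0 on success, -1 on a bad operand index or if out is too small. */
static int expand_asm_template(const char *tmpl, const char *const *operands,
                               size_t num_operands, char *out, size_t out_size)
{
    assert(tmpl != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    size_t n = 0;
    for (const char *p = tmpl; *p; p++) {
        const char *piece = p;   /* default: copy this one character */
        size_t len = 1;

        if (*p == '%' && p[1] == '%') {
            p++;                 /* "%%" -> a single '%' */
        } else if (*p == '%' && p[1] >= '0' && p[1] <= '9') {
            size_t idx = (size_t)(p[1] - '0');
            if (idx >= num_operands)
                return -1;
            piece = operands[idx];
            len = strlen(piece);
            p++;                 /* skip the digit */
        }

        if (n + len + 1 > out_size)
            return -1;           /* keep room for the terminating NUL */
        memcpy(out + n, piece, len);
        n += len;
    }
    out[n] = '\0';
    return 0;
}
```

    With operand 0 assigned "%rax" and operand 1 assigned "%rdx", the template "blsi %1, %0" expands to "blsi %rdx, %rax", and the compiler never had to know what blsi means.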

    For something like the call func in your example, that punts to the programmer the problem of writing all the clobber declarations that tell the compiler about every register a call to an arbitrary function could modify, assuming it follows the standard calling convention.

    That also lets you write stuff like asm("blsi %1, %0" : "=r"(dst) : "r"(src) ), with an output-only register operand and an input-only register operand, where the compiler chooses which registers to actually use. That lets the compiler do register allocation around the black box (the asm statement) as efficiently as possible. It can pick the same register for input and output, or not, as convenient, because the source didn't use an "early clobber" ("=&r"), so it can assume all inputs are read before any outputs are written.

    It's great for wrapping single instructions, but can be used to wrap multiple instructions and access to pointed-to memory, e.g. via a "memory" clobber.
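    To make the multi-instruction and memory cases concrete, here is a minimal sketch assuming GCC or clang targeting x86-64 with the default AT&T syntax (it falls back to plain C elsewhere). The function names lowest_set_bit and store_via_asm are made up for illustration. The first computes the same result as blsi using baseline instructions, and because it writes the output before the last read of the input, it needs the early-clobber "=&r". The second stores through a pointer, so it declares a "memory" clobber.

```c
#include <assert.h>
#include <stddef.h>

/* Isolate the lowest set bit (same result as blsi) with three baseline
   instructions. %0 is written by the first instruction while %1 is still
   needed by the third, so the output must be early-clobber ("=&r") to stop
   the compiler from giving both operands the same register. */
static inline unsigned long lowest_set_bit(unsigned long src)
{
    unsigned long dst;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__("mov %1, %0\n\t"   /* dst = src  */
            "neg %0\n\t"       /* dst = -dst */
            "and %1, %0"       /* dst &= src */
            : "=&r"(dst)
            : "r"(src)
            : "cc");
#else
    dst = src & (0UL - src);   /* portable fallback */
#endif
    return dst;
}

/* Store through a pointer inside the asm. The compiler can't see that
   memory is written, so the "memory" clobber tells it not to keep *p
   cached in a register across the statement. */
static inline void store_via_asm(int *p, int value)
{
    assert(p != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile("movl %1, (%0)"
                     :
                     : "r"(p), "r"(value)
                     : "memory");
#else
    *p = value;                /* portable fallback */
#endif
}
```

    In both cases the compiler only sees constraints and clobbers; the instructions themselves stay opaque text in the template.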


    The MSVC-style syntax you're showing has to parse the block to detect clobbered registers, and mentions of var names. That's much harder.
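    To give a feel for what that involves, here is a minimal sketch in C, assuming the lexer has already captured the raw text between the braces (for example by switching modes after __asm and matching braces). It treats the first word on each line as the mnemonic, then classifies operand identifiers as registers or as other symbols that would still need to be looked up as C variables, labels or functions. The names AsmScan, scan_asm_block and the tiny register table are hypothetical, and the sketch ignores comments, labels, upper-case register names and, crucially, whether each register is read or written.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical register table; a real one would list every register name. */
static const char *const kRegs[] = {
    "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rsp", "rbp"
};
enum { kNumRegs = sizeof kRegs / sizeof kRegs[0] };

typedef struct {
    unsigned reg_mask;      /* bit i set => kRegs[i] appears as an operand */
    char symbols[8][32];    /* non-register operand identifiers */
    size_t num_symbols;
} AsmScan;

static int reg_index(const char *s, size_t len)
{
    for (int i = 0; i < kNumRegs; i++)
        if (strlen(kRegs[i]) == len && strncmp(kRegs[i], s, len) == 0)
            return i;
    return -1;
}

static void scan_asm_block(const char *text, AsmScan *out)
{
    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    int seen_mnemonic = 0;  /* reset at the start of every line */
    const char *p = text;
    while (*p) {
        unsigned char c = (unsigned char)*p;
        if (c == '\n') {
            seen_mnemonic = 0;
            p++;
        } else if (isdigit(c)) {
            /* Skip numeric literals such as 16 or 0x10 as one unit. */
            while (isalnum((unsigned char)*p))
                p++;
        } else if (isalpha(c) || c == '_') {
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            size_t len = (size_t)(p - start);
            if (!seen_mnemonic) {
                seen_mnemonic = 1;   /* first word on a line: the mnemonic */
                continue;
            }
            int r = reg_index(start, len);
            if (r >= 0) {
                /* Inside the block, a register name wins over a C variable. */
                out->reg_mask |= 1u << r;
            } else if (out->num_symbols < 8 && len < 32) {
                memcpy(out->symbols[out->num_symbols], start, len);
                out->symbols[out->num_symbols][len] = '\0';
                out->num_symbols++;
            }
        } else {
            p++;                     /* whitespace, commas, brackets, ... */
        }
    }
}
```

    For the block in the question, this reports rcx and rsp as mentioned registers and func as an unresolved symbol. Even then the compiler still has to work out which registers are actually modified, including everything call func may clobber, which is the part that requires knowing each instruction's semantics.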

    Modern clang does support asm{} blocks with a command line option, but it sucks to use efficiently (just like in MSVC); they're not capable of substituting a register for a variable name so inputs / outputs have to get bounced through memory.

    MSVC doesn't support asm blocks for targets other than 32-bit x86, probably because their compiler internals for handling asm{} are such a mess that it's not safe for functions that have register args. That makes it unusable for modern calling conventions. That's not a syntax problem, just a compiler technical-debt problem.

    But unavoidable inefficiency in getting data into / out of an asm{} block is a syntax / design problem. Don't make the same mistake as MSVC. Or if you do want to just let users mention var names, make it clear in your documentation that they can be replaced by registers or memory, to leave that option open if you think you can make it work in your optimizing back-end.