Search code examples
cassemblyinstructions

How many asm-instructions per C-instruction?


I realize that this question is impossible to answer absolutely, but I'm only after ballpark figures:

Given a reasonably sized C-program (thousands of lines of code), on average, how many ASM-instructions would be generated. In other words, what's a realistic C-to-ASM instruction ratio? Feel free to make assumptions, such as 'with current x86 architectures'.

I tried to Google about this, but I couldn't find anything.

Addendum: noticing how much confusion this question brought, I feel some need for an explanation: What I wanted to know by this answer, is to know, in practical terms, what "3GHz" means. I am fully aware of that the throughput per Herz varies tremendously depending on the architecture, your hardware, caches, bus speeds, and the position of the moon.

I am not after a precise and scientific answer, but rather an empirical answer that could be put into fathomable scales.

This isn't a trivial answer to place (as I became to notice), and this was my best effort at it. I know that the amount of resulting lines of ASM per lines of C varies depending on what you are doing. i++ is not in the same neighborhood as sqrt(23.1) - I know this. Additionally, no matter what ASM I get out of the C, the ASM is interpreted into various sets of microcode within the processor, which, again, depends on whether you are running AMD, Intel or something else, and their respective generations. I'm aware of this aswell.

The ballpark answers I've got so far are what I have been after: A project large enough averages at about 2 lines of x86 ASM per 1 line of ANSI-C. Today's processors probably would average at about one ASM command per clock cycle, once the pipelines are filled, and given a sample big enough.


Solution

  • I'm not sure what you mean by "C-instruction", maybe statement or line? Of course this will vary greatly due to a number of factors but after looking at a few sample programs of my own, many of them are close to the 2-1 mark (2 assembly instructions per LOC), I don't know what this means or how it might be useful.

    You can figure this out yourself for any particular program and implementation combination by asking the compiler to generate only the assembly (gcc -S for example) or by using a disassembler on an already compiled executable (but you would need the source code to compare it to anyway).

    Edit

    Just to expand on this based on your clarification of what you are trying to accomplish (understanding how many lines of code a modern processor can execute in a second):

    While a modern processor may run at 3 billion cycles per second that doesn't mean that it can execute 3 billion instructions per second. Here are some things to consider:

    • Many instructions take multiple cycles to execute (division or floating point operations can take dozens of cycles to execute).
    • Most programs spend the vast majority of their time waiting for things like memory accesses, disk accesses, etc.
    • Many other factors including OS overhead (scheduling, system calls, etc.) are also limiting factors.

    But in general yes, processors are incredibly fast and can accomplish amazing things in a short period of time.