language-agnostic, theory, interpreted-language

How can interpreted code be even a little efficient? (theoretical)


OK, first, I don't want any kind of flame war here or anything like it. My question is more theoretical, and will include a few examples.

So, as I wrote, I cannot understand how an interpreted language can be even a little efficient. And since it's modern, I will take Java as an example.

Let's go back to the days when there were no JIT compilers. Java has its virtual machine, which is basically its hardware. You write code, and it's compiled into bytecode to take at least some work off the virtual machine; that's fine. But considering how complex even a RISC instruction set can be in hardware, I cannot even imagine a way to do it on software-emulated hardware.

I have no experience writing virtual machines, so I don't know how it's done at the most efficient level, but I cannot think of anything more efficient than testing every instruction for a match and then doing the appropriate action. You know, something like: if (instruction == "something") { do it } else if (instruction == "something_different") { do it } etc.
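
As a rough illustration, something like this minimal C sketch with made-up opcodes (I have no idea how real VMs do it, so this is only meant to show the naive approach I am imagining):

    #include <stdint.h>

    /* hypothetical three-opcode bytecode, just for illustration */
    enum { OP_HALT = 0x00, OP_PUSH = 0x01, OP_ADD = 0x02 };

    void run(const uint8_t *code, int32_t *stack) {
        int sp = 0, pc = 0;
        for (;;) {
            uint8_t op = code[pc++];
            if (op == OP_PUSH) {            /* push the next bytecode byte as a value */
                stack[sp++] = code[pc++];
            } else if (op == OP_ADD) {      /* pop two values, push their sum */
                int32_t b = stack[--sp];
                int32_t a = stack[--sp];
                stack[sp++] = a + b;
            } else if (op == OP_HALT) {     /* stop the interpreter */
                return;
            }
            /* ...one more comparison for every additional opcode */
        }
    }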

But this has to be terribly slow. And still, even though there are articles saying Java was slow before JIT compilers, they also say it was not that slow. But to emulate it, performing one bytecode instruction must take many clock cycles of real hardware.

And yet entire platforms are based on Java, for example Android. The first versions of Android had no JIT compiler; they were interpreted. But shouldn't Android then be terribly slow? And yet it is not. I know that when you call an API function from the Android library, it is written in machine code, so it is efficient, and this helps a lot.

But imagine writing your own game engine from scratch, using the API just to display images. You would need to do many array copy operations and many calculations, which would be terribly slow when emulated.

And now an example, as I promised. Since I mainly work with MCUs, I found a JVM for the Atmel AVR MCU. They state that an 8 MHz MCU can do 20K Java opcodes per second. But since the AVR executes most instructions in one or two cycles, let's say about 6,000,000 native instructions per second on average. That makes the JVM without a JIT compiler roughly 300 times slower than machine code. So why did Java become so popular without a JIT compiler? Isn't that too great a performance loss? I just cannot understand it. Thanks.


Solution

  • We've had byte code around for a long time. On the old Apple II, the UCSD p-System was very popular; it compiled Pascal into byte code, which would be interpreted by an 8-bit 6502 running at about 1 MHz. Those programs did run reasonably fast.

    A bytecode interpreter would generally be based on a jump table rather than a chain of if/then/else statements. In C or C++, this would involve a switch statement. Fundamentally, the interpreter would have the equivalent of an array of processing code, and use the opcode in the byte code instruction as the index of the array.
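
    For example, a minimal sketch in C, with a few made-up opcodes; a compiler will typically turn the switch into a jump table indexed by the opcode, so dispatch is a single indexed jump rather than a chain of comparisons:

        #include <stdint.h>

        /* hypothetical three-opcode bytecode, just for illustration */
        enum { OP_HALT = 0x00, OP_PUSH = 0x01, OP_ADD = 0x02 };

        void run(const uint8_t *code, int32_t *stack) {
            int sp = 0, pc = 0;
            for (;;) {
                switch (code[pc++]) {       /* one indexed jump per opcode */
                case OP_PUSH:               /* push the next bytecode byte as a value */
                    stack[sp++] = code[pc++];
                    break;
                case OP_ADD: {              /* pop two values, push their sum */
                    int32_t b = stack[--sp];
                    int32_t a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case OP_HALT:
                    return;
                }
            }
        }

    Real interpreters refine this further (for example, direct-threaded dispatch using computed gotos), but the principle is the same: the opcode selects its handler directly instead of being compared against every possibility.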

    It's also possible to have byte code that's higher-level than the machine instructions, so that one byte code instruction would translate into several, sometimes numerous, machine code instructions. A byte code that was constructed for a particular language can do this fairly easily, as it only has to match the control and data structures of that particular language. This spreads the interpretation overhead over more useful work and makes the interpreter more efficient.
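
    As a sketch of what such a higher-level instruction might look like, here is a made-up OP_ARRAYCOPY opcode: the interpretation overhead is paid once per instruction, while the bulk of the work runs as native code inside the handler:

        #include <stdint.h>
        #include <string.h>

        /* hypothetical opcodes, for illustration only */
        enum { OP_HALT = 0x00, OP_ARRAYCOPY = 0x10 };

        /* copies 'len' elements from src to dst when it sees OP_ARRAYCOPY */
        void run(const uint8_t *code, const int32_t *src, int32_t *dst, size_t len) {
            int pc = 0;
            for (;;) {
                switch (code[pc++]) {
                case OP_ARRAYCOPY:
                    /* one interpreted instruction, but len * 4 bytes of real
                       work done by a native memcpy */
                    memcpy(dst, src, len * sizeof(int32_t));
                    break;
                case OP_HALT:
                    return;
                }
            }
        }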

    An interpreted language is likely to have some speed penalty when compared to a compiled language, but this is often unimportant. Many programs process input and output at human speed, and that leaves a tremendous amount of performance that can be wasted. Even a network-bound program is likely to have far more CPU power available than it needs. There are programs that can use all the CPU efficiency they can get, and for obvious reasons they tend not to be written in interpreted languages.

    And, of course, there's the question of what you get for some inefficiency that may or may not make a difference. Interpreted language implementations tend to be easier to port than compiled implementations, and the actual byte code is often portable. It can be easier to put higher-level functionality in the language. It allows the compilation step to be much shorter, meaning that execution can start much faster. It may allow better diagnostics if something goes wrong.