Tags: executable, cpu-architecture, instructions, self-modifying, platform-agnostic

Are there any smart cases of runtime code modification?


Can you think of any legitimate (smart) uses for runtime code modification (a program modifying its own code at runtime)?

Modern operating systems seem to frown upon programs that do this since this technique has been used by viruses to avoid detection.

All I can think of is some kind of runtime optimization that adds or removes code based on information that is available at runtime but cannot be known at compile time.


Solution

  • There are many valid cases for code modification. Generating code at run time is useful in several situations:

    • Some virtual machines use JIT compilation to improve performance.
    • Generating specialized functions on the fly has long been common in computer graphics. See e.g. Rob Pike, Bart Locanthi and John Reiser, "Hardware Software Tradeoffs for Bitmap Graphics on the Blit" (1984), or this posting (2006) by Chris Lattner on Apple's use of LLVM for runtime code specialization in their OpenGL stack.
    • In some cases software resorts to a technique known as a trampoline, which involves the dynamic creation of code on the stack (or in another place). Examples are GCC's nested functions and the signal mechanism of some Unices.
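
    As a rough illustration of runtime specialization, here is a minimal sketch in Python rather than machine code. It generates source text for a function with values that are only known at runtime baked in as constants, then compiles it. The name make_poly and the polynomial use case are assumptions chosen for illustration; a real JIT or graphics stack would emit native code instead.

```python
def make_poly(coeffs):
    """Generate and compile a function specialized for fixed coefficients.

    coeffs lists the polynomial coefficients, highest order first.
    """
    # Build an unrolled Horner-form expression with the coefficients
    # embedded as literals, so the generated code contains no loop.
    expr = "0.0"
    for c in coeffs:
        expr = f"({expr}) * x + {c!r}"
    source = f"def poly(x):\n    return {expr}\n"

    # Compile the generated source and pull the new function out of
    # the namespace it was defined in.
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["poly"]


# Coefficients that, in a real program, would only be known at runtime.
poly = make_poly([2.0, 0.0, 1.0])   # 2*x**2 + 1
result = poly(3.0)
```

    The loop over the coefficients runs once, at generation time. Every later call to poly executes straight-line code, which is the same trade-off native code generators make.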

    Sometimes existing code is translated into other code at runtime (this is called dynamic binary translation):

    • Emulators like Apple's Rosetta use this technique to speed up emulation. Another example is Transmeta's code morphing software.
    • Sophisticated debuggers and profilers like Valgrind or Pin use it to instrument your code while it is being executed.
    • Before hardware virtualization extensions were added to the x86 instruction set, virtualization software like VMware could not directly run privileged x86 code inside virtual machines. Instead it had to translate any problematic instructions on the fly into more appropriate custom code.

    Code modification can be used to work around limitations of the instruction set:

    • There was a time (long ago, I know) when computers had no instructions to return from a subroutine or to indirectly address memory. Self-modifying code was the only way to implement subroutines, pointers and arrays.
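
    To make this concrete, here is a small simulation of a made-up machine with no return instruction. The instruction names (load, add, patch, jmp, halt) and the memory layout are assumptions for illustration, not any real historical instruction set. Before each call, the caller overwrites the operand of the subroutine's final jump so that it leads back to the right place.

```python
def run(memory):
    """Interpret a toy machine whose code lives in ordinary, writable memory."""
    acc, pc = 0, 0
    while True:
        op, arg = memory[pc]
        pc += 1
        if op == "load":
            acc = arg
        elif op == "add":
            acc += arg
        elif op == "patch":
            # Self-modification: rewrite the operand of another instruction.
            target, value = arg
            memory[target] = (memory[target][0], value)
        elif op == "jmp":
            pc = arg
        elif op == "halt":
            return acc
        else:
            raise ValueError(f"unknown op: {op}")


program = [
    ("load", 1),         # 0
    ("patch", (7, 3)),   # 1: make the subroutine's exit jump go to address 3
    ("jmp", 6),          # 2: "call" the subroutine
    ("patch", (7, 5)),   # 3: second call should come back to address 5
    ("jmp", 6),          # 4: "call" again
    ("halt", None),      # 5
    ("add", 10),         # 6: subroutine body
    ("jmp", None),       # 7: exit jump, operand filled in by the caller
]
result = run(program)
```

    The same subroutine is called twice from different places and returns correctly each time, purely because the program rewrites one of its own instructions before jumping.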

    More cases of code modification:

    • Many debuggers replace instructions to implement breakpoints.
    • Some dynamic linkers modify code at runtime. This article provides some background on the runtime relocation of Windows DLLs, which is effectively a form of code modification.
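
    Breakpoints implemented by instruction replacement can be sketched on a toy machine as well. The instruction names (add, trap, halt) are hypothetical; on x86, debuggers commonly use the one-byte int3 instruction for the same purpose. The debugger saves the original instruction, writes a trap over it, and restores it when the trap is hit.

```python
def run(memory, pc=0, acc=0):
    """Run until a 'trap' or 'halt' is reached; return (reason, pc, acc)."""
    while True:
        op, arg = memory[pc]
        if op == "trap":
            return ("trap", pc, acc)
        if op == "halt":
            return ("halt", pc, acc)
        if op == "add":
            acc += arg
        else:
            raise ValueError(f"unknown op: {op}")
        pc += 1


program = [("add", 1), ("add", 2), ("add", 3), ("halt", None)]

# Set a breakpoint at address 2: remember the original, write a trap.
breakpoint_addr = 2
saved = program[breakpoint_addr]
program[breakpoint_addr] = ("trap", None)

# Execution stops at the breakpoint, before the replaced instruction runs.
stopped = run(program)

# Restore the original instruction and resume from where we stopped.
_, pc, acc = stopped
program[pc] = saved
finished = run(program, pc, acc)
```

    A real debugger would typically single-step over the restored instruction and then write the trap back so the breakpoint stays armed; that step is omitted here to keep the sketch short.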