decompiler

Why can't decompilers reproduce the original code, theoretically?


I searched the internet but did not find a concrete answer as to why decompilers are unable to reproduce the original source code. Somewhere I read that the problem is similar to the halting problem, but the explanation didn't say how. So what are the theoretical and technical limitations that prevent building a perfect decompiler?


Solution

  • It is, quite simply, a many-to-one problem. For example, in C:

    b++;
    

    and

    b+=1;
    

    and

    b = b + 1;
    

    may all get compiled to the same set of operations once the compiler and optimizer are done. The compiler reorders things, drops ineffective operations, and rewrites entire sections of code. By the time it is done, the output no longer records what you wrote, only a pretty good idea of what you intended to happen at a raw-CPU (or vCPU) level.

    It is even smart enough to remove variables that aren't needed:

    {
    a=5;
    b=func();
    c=a+b;
    d=func2(c);
    return d;
    }
    /* gets rewritten as something like (pseudocode, not real assembly): */
    REGISTERA = func()
    REGISTERA += 5
    return func2(REGISTERA)
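
To make the many-to-one point concrete, here is a minimal sketch in C. The function names are made up for illustration, and `func` and `func2` are hypothetical stand-ins with arbitrary bodies. With optimization enabled, a typical compiler is likely (though not guaranteed) to emit identical instructions for the three increment functions, and to drop the named temporaries in the second pair:

```c
/* Three spellings of "add one to b". They are semantically identical,
   so an optimizing compiler is free to emit the same code for each. */
int inc_postfix(int b)  { b++;       return b; }
int inc_compound(int b) { b += 1;    return b; }
int inc_explicit(int b) { b = b + 1; return b; }

/* Hypothetical stand-ins for func() and func2() from the answer above;
   the bodies are arbitrary and exist only so the example links. */
static int func(void)   { return 37; }
static int func2(int x) { return x * 2; }

/* As a programmer might write it, with named temporaries. */
int with_temporaries(void)
{
    int a = 5;
    int b = func();
    int c = a + b;
    int d = func2(c);
    return d;
}

/* Same observable behaviour with the names a, b, c, d gone. Nothing in
   the compiled output needs to record that they ever existed. */
int without_temporaries(void)
{
    return func2(func() + 5);
}
```

Compiling this with something like `gcc -O2 -S` and comparing the assembly for each pair is one way to check the claim on your own toolchain. A decompiler looking at that output has no basis for choosing which of the equivalent source forms was the original.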