Search code examples
c#compiler-constructionexpressionpostfix-notation

What type of algorithm is used for C# expressions?


Shunting-yard algorithm is used to convert expressions from infix to postfix notation (Reverse Polish notation), so that it is eaiser to evaluate them by a compiler. For example, 2 + 3 * 2 would be converted to 2 3 2 * +. In Wikipedia, it is mentioned that this algorithm is used by many applications including

Any stack-oriented programming language, such as: Forth, Factor, PostScript page description language, Befunge, Joy

I don't see C# or even any popular high-level language. So, does C# use this algorithm for expressions? If no, how does C#-compiler compile and evaluate expressions?


Solution

  • Yes, C# is converted to a stack-oriented programming language -- IL. When the compiler converts an expression to the internal language, it makes a list of operations that follow RPN.

    For example, this method

    int x(int a, int b, int c, int d) {
        return a+b*(c+d);
    }
    

    gets converted to this internal language (see comments for explanation of what is going on):

    IL_0001:  ldarg.0 // Push a on the stack
    IL_0002:  ldarg.1 // Push b on the stack
    IL_0003:  ldarg.2 // Push c on the stack
    IL_0004:  ldarg.3 // Push d on the stack
    IL_0005:  add     // Add d+c, push the result
    IL_0006:  mul     // Multiply (d+c) by b
    IL_0007:  add     // Add b*(d+c)+a
    

    As you can see, the operands are pushed onto stack in the order that makes it convenient to perform operations by working from the back of the expression to its front - exactly the way the article explains it.

    The exact algorithm of the conversion is compiler-dependent (strictly speaking, the end result is compiler-dependent as well, because there are multiple ways to convert an expression to a valid RPN sequence) but the basic idea of RPN is there.