c# .net string clr

System.String does not overload operator +=, but string concatenation works. How?


System.String overloads only two operators:

public static bool operator ==(string a, string b)
{
  return string.Equals(a, b);
}

public static bool operator !=(string a, string b)
{
  return !string.Equals(a, b);
}

But += still works for string concatenation. For example:

    private static void Main()
    {
        String str = "Hello ";
        str += "World";

        Console.WriteLine(str);
    }

This works just fine.

So if System.String doesn't overload operator +=, how does it concatenate the strings?


Solution

  • First, the operator += can't be overloaded (at least before C# 14, which added user-defined compound assignment operators). If you have the expression A += B, it's compiled as if you wrote:*

    A = A + B
    

    Okay, that's why string doesn't overload operator += (because it can't be overloaded). So, why doesn't it overload operator + either? It's just one more difference between the CLR and C#. The C# compiler knows that types like string and int are special, and it generates special code for their operators (calling string.Concat() for string, or the add instruction for int).
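To see the A = A + B rewrite in action, here is a minimal sketch using a hypothetical Meters struct (not part of the original answer) that overloads only operator +. The += line still compiles, because the compiler expands it into a call to that operator.

```csharp
using System;

// Hypothetical type for illustration only; it defines operator + and nothing else.
public struct Meters
{
    public readonly int Value;

    public Meters(int value)
    {
        Value = value;
    }

    public static Meters operator +(Meters left, Meters right)
    {
        return new Meters(left.Value + right.Value);
    }
}

public static class CompoundAssignmentDemo
{
    public static int AddWithCompoundAssignment(int a, int b)
    {
        Meters total = new Meters(a);
        total += new Meters(b);   // treated roughly as: total = total + new Meters(b);
        return total.Value;
    }

    public static void Main()
    {
        Console.WriteLine(AddWithCompoundAssignment(1, 2)); // prints 3
    }
}
```

Removing operator + from Meters makes the += line fail to compile, which is the behaviour the answer describes.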

    Why are these operators treated in a special way? Because you want them treated in a special way. I think this is most clear for int:

    1. You don't want each int addition to be compiled as a method call, because that would add a lot of overhead. That's why a special instruction is used for int addition.
    2. Integer addition doesn't always behave the same with regard to overflow. There is a compiler switch to throw exceptions on overflow, and you can also use the checked and unchecked operators. How should the compiler deal with that if it had only operator +? (What it actually does is use the add instruction for unchecked addition and add.ovf for checked addition.)
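The second point can be sketched with the checked and unchecked operators. The class and method names below are made up for illustration; the same + on the same operands behaves differently depending on the overflow context.

```csharp
using System;

public static class OverflowDemo
{
    // Unchecked context: overflow wraps around silently (the add instruction).
    public static int AddUnchecked(int a, int b)
    {
        return unchecked(a + b);
    }

    // Checked context: overflow throws OverflowException (the add.ovf instruction).
    public static int AddChecked(int a, int b)
    {
        return checked(a + b);
    }

    public static void Main()
    {
        Console.WriteLine(AddUnchecked(int.MaxValue, 1) == int.MinValue); // prints True

        try
        {
            AddChecked(int.MaxValue, 1);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```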

    And you want to treat string addition in a special way too, for performance reasons. For example, if you have strings a, b and c and write a + b + c, and the compiler turned that into two calls to operator +, it would need to allocate a temporary string for the result of a + b, which is inefficient. Instead, the compiler generates that code as string.Concat(a, b, c), which can directly allocate a single string of the required length.
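As a rough illustration, the two methods below (hypothetical names) should produce the same result. The first is what you write, the second is approximately what the compiler emits for non-constant operands. Note that string.Concat treats null as an empty string, which is why + on strings does too.

```csharp
using System;

public static class ConcatDemo
{
    // What you write: two + operators in source.
    public static string WithPlus(string a, string b, string c)
    {
        return a + b + c;
    }

    // Roughly what the compiler generates: a single call, a single allocation.
    public static string WithConcat(string a, string b, string c)
    {
        return string.Concat(a, b, c);
    }

    public static void Main()
    {
        Console.WriteLine(WithPlus("Hello", ", ", "World"));   // prints Hello, World
        Console.WriteLine(WithConcat("Hello", ", ", "World")); // prints Hello, World
        Console.WriteLine(WithPlus("Hello", null, "!"));       // prints Hello!
    }
}
```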


    * This is not exactly right. For details, see Eric Lippert's article Compound Assignment, Part One and the Compound assignment section of the C# specification. Also note the missing semicolons: A += B really is an expression, so you can write, for example, X += Y += Z;.
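The footnote's caveats can be sketched as follows (class and method names are hypothetical). The sketch shows that += is an expression that chains right to left, that the left-hand side is evaluated only once, and that the result is converted back to the type of the left-hand side, so A += B is not always identical to A = A + B.

```csharp
using System;

public static class CompoundExpressionDemo
{
    // x += y += z is right-associative: y += z runs first, then its result is added to x.
    public static int[] Chain(int x, int y, int z)
    {
        x += y += z;
        return new[] { x, y, z };
    }

    // The left-hand side is evaluated once, so i++ runs a single time.
    public static int IndexEvaluatedOnce()
    {
        int[] values = { 10, 20 };
        int i = 0;
        values[i++] += 5;   // writing values[i++] = values[i++] + 5 would increment i twice
        return i;
    }

    // Compiles because += converts the int result back to byte;
    // b = b + 1 would be a compile-time error without an explicit cast.
    public static byte AddToByte(byte b)
    {
        b += 1;
        return b;
    }

    public static void Main()
    {
        int[] r = Chain(1, 2, 3);
        Console.WriteLine($"{r[0]}, {r[1]}, {r[2]}"); // prints 6, 5, 3
        Console.WriteLine(IndexEvaluatedOnce());       // prints 1
        Console.WriteLine(AddToByte(1));               // prints 2
    }
}
```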