c# .net string string-interpolation

Avoidable boxing in string interpolation


String interpolation makes my format strings much clearer, but I have to add .ToString() calls whenever the data is a value type, otherwise the value gets boxed.

class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

var person = new Person { Name = "Tom", Age = 10 };
var displayText = $"Name: {person.Name}, Age: {person.Age.ToString()}";

The .ToString() makes the format longer and uglier. I tried to get rid of it, but string.Format is a built-in static method, so I can't hook into it. Do you have any ideas? And since string interpolation is syntactic sugar for string.Format, why doesn't the compiler add the .ToString() calls itself when it generates the code behind the sugar? I think it's doable.


Solution

  • With the release of C# 10 and .NET 6, things changed. The compiler has been optimised to handle interpolated strings better. What follows is taken from a post by Stephen Toub.

    In essence: earlier versions of the C# compiler translated an interpolated string $"{major}.{minor}.{build}.{revision}" (with all the variables being integers) into something like this:

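    // One object[] is allocated to hold the arguments.
    var array = new object[4];
    array[0] = major;     // each int is boxed when stored as object
    array[1] = minor;
    array[2] = build;
    array[3] = revision;
    // The composite format string is parsed at run time.
    string.Format("{0}.{1}.{2}.{3}", array);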
    

    Since C# 10, the compiler can use "interpolated string handlers". The string above can be translated into:

    var handler = new DefaultInterpolatedStringHandler(literalLength: 3, formattedCount: 4);
    handler.AppendFormatted(major);
    handler.AppendLiteral(".");
    handler.AppendFormatted(minor);
    handler.AppendLiteral(".");
    handler.AppendFormatted(build);
    handler.AppendLiteral(".");
    handler.AppendFormatted(revision);
    return handler.ToStringAndClear();
    

    According to the author, this not only removes the need for boxing but also brings further enhancements. As a result, the new approach achieves "40% throughput improvement and an almost 5x reduction in memory allocation" (Stephen Toub).
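
    Applied to the question's example, this could look like the minimal sketch below. It assumes C# 10 on .NET 6 or later; `PersonFormatter` and its two methods are hypothetical names introduced only for illustration. `Display` is what you would actually write, while `DisplayLowered` spells out roughly what the compiler is expected to generate for it, so neither needs a `.ToString()` call on `Age`.

    ```csharp
    using System.Runtime.CompilerServices;

    public sealed class Person
    {
        public string Name { get; set; } = "";
        public int Age { get; set; }
    }

    // Hypothetical helper, for illustration only.
    public static class PersonFormatter
    {
        // Plain interpolation: no explicit .ToString() on the int.
        public static string Display(Person person) =>
            $"Name: {person.Name}, Age: {person.Age}";

        // Hand-written approximation of the compiler's lowering.
        public static string DisplayLowered(Person person)
        {
            // "Name: " (6 chars) + ", Age: " (7 chars) = 13 literal chars, 2 holes.
            var handler = new DefaultInterpolatedStringHandler(literalLength: 13, formattedCount: 2);
            handler.AppendLiteral("Name: ");
            handler.AppendFormatted(person.Name);
            handler.AppendLiteral(", Age: ");
            // Resolves to the generic AppendFormatted<int>, so the int is not boxed.
            handler.AppendFormatted(person.Age);
            return handler.ToStringAndClear();
        }
    }
    ```

    With `new Person { Name = "Tom", Age = 10 }`, both methods return `Name: Tom, Age: 10`.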

    And this is only the tip of the iceberg: besides interpolated string handlers, C# 10 introduced further optimisations in how interpolated strings are handled.

    So I stopped worrying too much about performance and now focus on clarity and readability instead. Chances are we would spend a lot of time trying to outsmart the compiler, at the cost of code clarity, without achieving anything substantial. Unless we have serious performance issues, I'd say we can use interpolated strings without taking complicated detours.

    Limitation: the situation may be different when we translate (localise) strings. The format string then becomes a dynamic resource, so the compiler cannot process it at compile time and we are back to string.Format.
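
    A minimal sketch of that fallback, under stated assumptions: the dictionary below is a hypothetical stand-in for a real resource file, and `LocalizedPersonText` and `Describe` are names made up for this example. Because the format string is only known at run time, the call goes through `string.Format`, and the `int` argument is boxed again when it is passed as `object`.

    ```csharp
    using System.Collections.Generic;

    // Hypothetical stand-in for translated resources, for illustration only.
    public static class LocalizedPersonText
    {
        private static readonly Dictionary<string, string> Formats = new()
        {
            ["en"] = "Name: {0}, Age: {1}",
            ["de"] = "Name: {0}, Alter: {1}",
        };

        public static string Describe(string language, string name, int age)
        {
            // The format is looked up at run time, so an interpolated string cannot be used here.
            string format = Formats[language];

            // string.Format takes object arguments, so 'age' is boxed here.
            return string.Format(format, name, age);
        }
    }
    ```

    For example, `LocalizedPersonText.Describe("de", "Tom", 10)` returns `Name: Tom, Alter: 10`.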