I am optimizing our debug print facilities (class).
The class is fairly straightforward, with a global "enabled" bool and a PrineDebug routine.
I'm investigating the performance of the PrineDebug method in "disabled" mode, trying to create a framework with less impact on run time when no debug prints are needed.
During the exploration I came across the results below, which surprised me, and I wonder what I am missing here.
public class Profiler
{
    private bool isDebug = false;

    public void PrineDebug(string message)
    {
        if (isDebug)
        {
            Console.WriteLine(message);
        }
    }
}
[MemoryDiagnoser]
public class ProfilerBench
{
    private Profiler profiler = new Profiler();
    private int five = 5;
    private int six = 6;

    [Benchmark]
    public void DebugPrintConcat()
    {
        profiler.PrineDebug("sometext_" + five + "_" + six);
    }

    [Benchmark]
    public void DebugPrintInterpolated()
    {
        profiler.PrineDebug($"sometext_{five}_{six}");
    }
}
Running this benchmark under BenchmarkDotNet gives the following results:
| Method | Mean | Error | StdDev | Gen 0 | Allocated |
|----------------------- |---------:|--------:|--------:|-------:|----------:|
| DebugPrintConcat | 149.0 ns | 3.02 ns | 6.03 ns | 0.0136 | 72 B |
| DebugPrintInterpolated | 219.4 ns | 4.13 ns | 6.18 ns | 0.0181 | 96 B |
I thought the Concat approach would be slower, as every + operation actually creates a new string (plus an allocation), but it seems the interpolation caused more allocation and took more time.
Can you explain?
TLDR: Interpolated strings are the best overall. They only allocate more memory in your benchmark because you are using an older .NET version and the strings for small numbers are cached.
There's a lot to talk about here.
First off, a lot of people think string concatenation using + will always create a new string for every +. That might be the case in a loop, but if you chain several + operators in a single expression, the compiler will actually replace them with a single call to string.Concat, making the complexity O(n), not O(n^2). Your DebugPrintConcat actually compiles to this:
public void DebugPrintConcat()
{
    profiler.PrineDebug(string.Concat("sometext_", five.ToString(), "_", six.ToString()));
}
It should be noted that in your specific case, you are not benchmarking string allocation for the integers, because .NET caches string instances for small numbers, so the .ToString() calls on five and six end up allocating nothing. The memory allocation would have been quite different if you had used bigger numbers or a format string (like .ToString("0000")).
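The caching effect can be observed with a reference comparison. This is only a rough sketch: the class and method names are made up for illustration, and which small values are cached depends on the runtime version, so no result is claimed for them here. A large value such as 116521345 produces a fresh string on every call.

```csharp
using System;

// Hypothetical helper, not part of the original benchmark.
public static class SmallNumberCacheDemo
{
    // Returns true when two separate ToString() calls return the very same
    // string instance, which indicates that no new string was allocated.
    public static bool SameInstance(int value)
    {
        string first = value.ToString();
        string second = value.ToString();
        return ReferenceEquals(first, second);
    }
}
```

Calling SmallNumberCacheDemo.SameInstance(5) and SmallNumberCacheDemo.SameInstance(116521345) on your own runtime shows whether the benchmark values in the question were served from a cache.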
There are three ways of concatenating strings: + (that is, string.Concat()), string.Format(), and interpolated strings. Interpolated strings used to be exactly the same as string.Format(), as $"..." was just syntactic sugar for string.Format(), but that is no longer the case since .NET 6, where they were redesigned around interpolated string handlers.
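To make the redesign more concrete, here is roughly what the compiler generates for $"sometext_{five}_{six}" on .NET 6 and later. Treat it as an approximation written by hand, not actual decompiler output; the wrapper class and method names are invented for the example.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical wrapper, used only to show the lowered form.
public static class InterpolationLowering
{
    public static string Build(int five, int six)
    {
        // literalLength 10 = "sometext_".Length + "_".Length,
        // formattedCount 2 = number of interpolation holes.
        var handler = new DefaultInterpolatedStringHandler(10, 2);
        handler.AppendLiteral("sometext_");
        handler.AppendFormatted(five); // generic call, so the int is not boxed
        handler.AppendLiteral("_");
        handler.AppendFormatted(six);
        return handler.ToStringAndClear(); // the only string that gets created
    }
}
```

The handler writes into a pooled char buffer, which is why the returned string is normally the only allocation.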
Another myth I have to address is the belief that using string.Format() on structs always leads to first boxing the struct and then creating an intermediate string by calling .ToString() on the boxed struct. The second part is false: for years now, all primitive types have implemented ISpanFormattable, which allows string.Format() to skip creating an intermediate string and write the string representation of the object directly into its internal buffer. ISpanFormattable became public with the release of .NET 6, so you can implement it for your own types, too (more on that at the end of this answer).
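As a minimal sketch of what implementing ISpanFormattable on your own type could look like (assuming .NET 6 or later; the Point type is invented for this example and the format argument is simply ignored):

```csharp
#nullable enable
using System;

// Hypothetical value type used only for illustration.
public readonly struct Point : ISpanFormattable
{
    public Point(int x, int y) { X = x; Y = y; }

    public int X { get; }
    public int Y { get; }

    // Writes the text directly into the caller's buffer, no intermediate string.
    // Returns false when the destination is too small.
    public bool TryFormat(Span<char> destination, out int charsWritten,
                          ReadOnlySpan<char> format, IFormatProvider? provider)
        => destination.TryWrite(provider, $"({X}, {Y})", out charsWritten);

    // Fallback for callers that need a real string.
    public string ToString(string? format, IFormatProvider? formatProvider)
        => string.Create(formatProvider, $"({X}, {Y})");

    public override string ToString() => ToString(null, null);
}
```

With this in place, $"{new Point(1, 2)}" goes through TryFormat, so the point's text is written straight into the handler's buffer.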
About the memory characteristics of each approach, ordered from worst to best:
1. string.Concat() (the overloads accepting objects, not strings) is the worst, because it will always box structs and create intermediate strings (source: decompilation using ILSpy).
2. + and string.Concat() (the overloads accepting strings, not objects) are slightly better than the previous one, because while they do use intermediate strings, they don't box structs.
3. string.Format() is generally better than the previous ones because, as mentioned earlier, it does need to box structs but does not make an intermediate string if the structs implement ISpanFormattable (which was internal to .NET until not too long ago, but the performance benefit was there nevertheless). Furthermore, string.Format() is much less likely to need an object[] allocation than the previous methods.
4. Interpolated strings are the best: since .NET 6 they neither box structs nor create intermediate strings for types that implement ISpanFormattable. The only allocation you will generally get with them is just the returned string and nothing else.

To support the claims above, I'm adding a benchmark class and benchmark results below, making sure to avoid the situation in the original post where + performs best only because strings are cached for small ints:
[MemoryDiagnoser]
[RankColumn]
public class ProfilerBench
{
    private float pi = MathF.PI;
    private double e = Math.E;
    private int largeInt = 116521345;

    [Benchmark(Baseline = true)]
    public string StringPlus()
    {
        return "sometext_" + pi + "_" + e + "_" + largeInt + "...";
    }

    [Benchmark]
    public string StringConcatStrings()
    {
        // the string[] overload
        // the exact same as StringPlus()
        return string.Concat("sometext_", pi.ToString(), "_", e.ToString(), "_", largeInt.ToString(), "...");
    }

    [Benchmark]
    public string StringConcatObjects()
    {
        // the params object[] overload
        return string.Concat("sometext_", pi, "_", e, "_", largeInt, "...");
    }

    [Benchmark]
    public string StringFormat()
    {
        // the (format, object, object, object) overload
        // note that the methods above had to allocate an array unlike string.Format()
        return string.Format("sometext_{0}_{1}_{2}...", pi, e, largeInt);
    }

    [Benchmark]
    public string InterpolatedString()
    {
        return $"sometext_{pi}_{e}_{largeInt}...";
    }
}
Results are ordered by bytes allocated:
Method | Mean | Error | StdDev | Rank | Gen 0 | Allocated |
---|---|---|---|---|---|---|
StringConcatObjects | 293.9 ns | 1.66 ns | 1.47 ns | 4 | 0.0386 | 488 B |
StringPlus | 266.8 ns | 2.04 ns | 1.91 ns | 2 | 0.0267 | 336 B |
StringConcatStrings | 278.7 ns | 2.14 ns | 1.78 ns | 3 | 0.0267 | 336 B |
StringFormat | 275.7 ns | 1.46 ns | 1.36 ns | 3 | 0.0153 | 192 B |
InterpolatedString | 249.0 ns | 1.44 ns | 1.35 ns | 1 | 0.0095 | 120 B |
If I edit the benchmark class to use more than three format arguments, the difference between InterpolatedString and string.Format() becomes even greater because of the array allocation:
[MemoryDiagnoser]
[RankColumn]
public class ProfilerBench
{
    private float pi = MathF.PI;
    private double e = Math.E;
    private int largeInt = 116521345;
    private float anotherNumber = 0.123456789f;

    [Benchmark]
    public string StringPlus()
    {
        return "sometext_" + pi + "_" + e + "_" + largeInt + "..." + anotherNumber;
    }

    [Benchmark]
    public string StringConcatStrings()
    {
        // the string[] overload
        // the exact same as StringPlus()
        return string.Concat("sometext_", pi.ToString(), "_", e.ToString(), "_", largeInt.ToString(), "...", anotherNumber.ToString());
    }

    [Benchmark]
    public string StringConcatObjects()
    {
        // the params object[] overload
        return string.Concat("sometext_", pi, "_", e, "_", largeInt, "...", anotherNumber);
    }

    [Benchmark]
    public string StringFormat()
    {
        // the (format, object[]) overload
        return string.Format("sometext_{0}_{1}_{2}...{3}", pi, e, largeInt, anotherNumber);
    }

    [Benchmark]
    public string InterpolatedString()
    {
        return $"sometext_{pi}_{e}_{largeInt}...{anotherNumber}";
    }
}
Benchmark results, again ordered by bytes allocated:
Method | Mean | Error | StdDev | Rank | Gen 0 | Allocated |
---|---|---|---|---|---|---|
StringConcatObjects | 389.3 ns | 2.65 ns | 2.34 ns | 4 | 0.0477 | 600 B |
StringPlus | 350.7 ns | 1.88 ns | 1.67 ns | 2 | 0.0329 | 416 B |
StringConcatStrings | 374.4 ns | 6.90 ns | 6.46 ns | 3 | 0.0329 | 416 B |
StringFormat | 390.4 ns | 2.01 ns | 1.88 ns | 4 | 0.0234 | 296 B |
InterpolatedString | 332.6 ns | 2.82 ns | 2.35 ns | 1 | 0.0114 | 144 B |
EDIT: People might still think calling .ToString() on interpolated string handler arguments is a good idea. It is not: performance will suffer if you do it, and Visual Studio even kind of warns you not to. This does not apply only to .NET 6. Below you can see that even with string.Format(), which interpolated strings used to be syntactic sugar for, it is still bad to call .ToString():
[MemoryDiagnoser]
[RankColumn]
public class ProfilerBench
{
    private float pi = MathF.PI;
    private double e = Math.E;
    private int largeInt = 116521345;
    private float anotherNumber = 0.123456789f;

    [Benchmark]
    public string StringFormatGood()
    {
        // the (format, object[]) overload with boxing structs
        return string.Format("sometext_{0}_{1}_{2}...{3}", pi, e, largeInt, anotherNumber);
    }

    [Benchmark]
    public string StringFormatBad()
    {
        // the (format, object[]) overload with pre-converting the structs to strings
        return string.Format("sometext_{0}_{1}_{2}...{3}",
            pi.ToString(),
            e.ToString(),
            largeInt.ToString(),
            anotherNumber.ToString());
    }
}
Method | Mean | Error | StdDev | Rank | Gen 0 | Allocated |
---|---|---|---|---|---|---|
StringFormatGood | 389.0 ns | 2.27 ns | 2.12 ns | 1 | 0.0234 | 296 B |
StringFormatBad | 442.0 ns | 4.62 ns | 4.09 ns | 2 | 0.0305 | 384 B |
The explanation for the results is that it is cheaper to box the struct and have string.Format() write the string representation directly into its char buffer than to create an intermediate string explicitly and force string.Format() to copy from it.
If you want to read more about how interpolated string handlers work and how to make your own types implement ISpanFormattable, this is a good read: link
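Coming back to the original goal of making disabled debug prints cheap: a custom interpolated string handler can skip the formatting work entirely when debugging is off. The following is only a sketch under the assumption of C# 10 and .NET 6 or later. DebugLog and DebugHandler are hypothetical names, not an existing API, and only plain {value} holes are supported (no alignment or format specifiers).

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical logger; stands in for the Profiler class from the question.
public class DebugLog
{
    public bool IsDebug { get; set; }

    // The empty string in the attribute passes "this" to the handler constructor.
    public void PrintDebug([InterpolatedStringHandlerArgument("")] DebugHandler message)
    {
        if (IsDebug)
        {
            Console.WriteLine(message.ToStringAndClear());
        }
    }
}

[InterpolatedStringHandler]
public ref struct DebugHandler
{
    private DefaultInterpolatedStringHandler inner;
    private readonly bool enabled;

    // When isEnabled comes back false, the compiler-generated code skips
    // every Append call, so nothing is formatted or allocated.
    public DebugHandler(int literalLength, int formattedCount, DebugLog log, out bool isEnabled)
    {
        enabled = isEnabled = log.IsDebug;
        inner = enabled
            ? new DefaultInterpolatedStringHandler(literalLength, formattedCount)
            : default;
    }

    public void AppendLiteral(string value) => inner.AppendLiteral(value);

    public void AppendFormatted<T>(T value) => inner.AppendFormatted(value);

    public string ToStringAndClear() => enabled ? inner.ToStringAndClear() : string.Empty;
}
```

A call such as log.PrintDebug($"sometext_{five}_{six}") then costs little more than a bool check while IsDebug is false, because the arguments are never converted to text.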