Back in the day, I was always told to use AddRange whenever possible because of the performance gain over Add. Today, I wanted to validate this statement before using it with the output of a List<T>.Select(x => ...) call, but from the decompiled code of List<T>, I am under the impression that the foreach { Add } loop should be faster.
The main reasons I see are the numerous additional checks done in the process when the argument is not an ICollection:
- the internal _size can only be grown in one step when the source is an ICollection (which it isn't after the Select call);
- otherwise each element goes through Insert rather than Add (Insert does more work, since it can also put data before or within the existing list).
Has anyone ever done reliable benchmarking on this?
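Whether the result of a Select call is an ICollection<T> can be checked directly. The following is only a minimal sketch: the class and method names (FastPathCheck, IsCollection) are made up for illustration, and it shows nothing about what AddRange does internally, only the type test that the reasoning above relies on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FastPathCheck
{
    // Hypothetical helper: true when the sequence exposes Count/CopyTo via ICollection<T>.
    public static bool IsCollection<T>(IEnumerable<T> source) => source is ICollection<T>;

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };
        IEnumerable<int> projected = numbers.Select(n => n * 2);

        // A List<T> is an ICollection<T>.
        Console.WriteLine(IsCollection(numbers));
        // The lazy Select result is an iterator, not an ICollection<T>.
        Console.WriteLine(IsCollection(projected));
        // Materializing with ToList() yields an ICollection<T> again,
        // at the cost of an intermediate list.
        Console.WriteLine(IsCollection(projected.ToList()));
    }
}
```

Calling ToList() before AddRange would make the ICollection check succeed, but it also allocates a temporary list, so it is not obviously a win.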
Edit: Added Code samples of the two options
var clientsConfirmations = new List<ClientConfirmation>();
foreach (var client in Clients)
{
    var clientConf = new ClientConfirmation(client.Id, client.Name,
        client.HasFlag ?? false);
    clientsConfirmations.Add(clientConf);
}
Versus
var clientsConfirmations = new List<ClientConfirmation>();
clientsConfirmations.AddRange(
    Clients.Select(client =>
        new ClientConfirmation(client.Id, client.Name, client.HasFlag ?? false)
    )
);
I wrote a small test application to benchmark the two versions of the code. I'm not an expert at benchmarking, so if you find problems in it, let me know.
I ran the benchmark (50k clients transformed 1,000 times per run, so 50M iterations per run, over 10 runs) and the results were:
Iteration | First run | Add (ms) | AddRange (ms) | Delta (ms) |
---|---|---|---|---|
1 | Add | 25228 | 25385 | 157 |
2 | AddRange | 21561 | 24682 | 3121 |
3 | Add | 24182 | 25317 | 1135 |
4 | AddRange | 25647 | 24749 | -898 |
5 | Add | 23347 | 24508 | 1161 |
6 | AddRange | 22699 | 24416 | 1717 |
7 | Add | 25819 | 24491 | -1328 |
8 | AddRange | 19830 | 22113 | 2283 |
9 | Add | 19376 | 19762 | 386 |
10 | AddRange | 25119 | 25220 | 101 |
Avg | - | 23280.8 | 24064.3 | 783.5 |
I would say that, in this specific context, Add is marginally faster than AddRange (about 3% on average) but not a clear winner: AddRange came out ahead in two of the ten runs. Unless the code is performance critical, it is a matter of preference / keeping the code semantically clean.
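Since the gap between the two versions is smaller than the run-to-run variation, a slightly more defensive timing helper might help. What follows is a rough sketch, not a replacement for a dedicated harness such as BenchmarkDotNet: the Measure class and MedianMilliseconds method are hypothetical names, and the choices made (one untimed warm-up call, a forced garbage collection before each timed run, reporting the median) are assumptions about what reduces noise here.

```csharp
using System;
using System.Diagnostics;

public static class Measure
{
    // Hypothetical helper: runs the action once untimed (JIT warm-up),
    // then `runs` timed runs, and returns the median elapsed time in ms.
    public static double MedianMilliseconds(Action action, int runs = 5)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        action(); // warm-up, not measured

        var samples = new double[runs];
        for (var i = 0; i < runs; i++)
        {
            // Start each run from a similar heap state (an assumption, not a guarantee).
            GC.Collect();
            GC.WaitForPendingFinalizers();

            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            samples[i] = sw.Elapsed.TotalMilliseconds;
        }

        Array.Sort(samples);
        var mid = runs / 2;
        return runs % 2 == 1 ? samples[mid] : (samples[mid - 1] + samples[mid]) / 2.0;
    }
}
```

Each version would then be passed as a lambda, for example Measure.MedianMilliseconds(() => Version1(clients)), and the two medians compared instead of single timings.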
Benchmark Code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Benchmarks
{
    class Program
    {
        private class Client
        {
            public int Id { get; set; }
            public string Name { get; set; }
            public bool? HasFlag { get; set; }
        }

        private class ClientConfirmation
        {
            public int Id { get; set; }
            public string Name { get; set; }
            public bool Flag { get; set; }

            public ClientConfirmation(int id, string name, bool flag)
            {
                Id = id;
                Name = name;
                Flag = flag;
            }
        }

        static void Main(string[] args)
        {
            var rng = new Random();
            // Each call prints two table rows (iteration and iteration + 1).
            for (var i = 1; i <= 10; i += 2)
            {
                AddVsAddRange(rng, i);
            }
            _ = Console.Read();
        }

        private static void AddVsAddRange(Random rng, int iteration)
        {
            // Build 50k clients with random data.
            var clients = new List<Client>();
            for (var i = 0; i < 50000; i++)
            {
                clients.Add(new Client()
                {
                    Id = rng.Next(),
                    Name = Guid.NewGuid().ToString(),
                    // rng.Next(0, 3) returns 0, 1 or 2; the discard arm covers 2.
                    HasFlag = rng.Next(0, 3) switch
                    {
                        0 => false,
                        1 => true,
                        _ => null
                    }
                });
            }

            // Odd iteration: Add runs first, then AddRange.
            Console.WriteLine($"| {iteration} | {Version1(clients)} | {Version2(clients)} |");

            // Even iteration: AddRange runs first, then Add.
            var v2 = Version2(clients);
            var v1 = Version1(clients);
            Console.WriteLine($"| {iteration+1} | {v1} | {v2} |");
            clients.Clear();
        }

        // foreach + Add: 1000 passes over the clients into a single list.
        private static long Version1(IEnumerable<Client> clients)
        {
            var sw = System.Diagnostics.Stopwatch.StartNew();
            var clientsConfirmations = new List<ClientConfirmation>();
            for (var i = 0; i < 1000; i++)
            {
                foreach (var client in clients)
                {
                    bool flag = false;
                    if (client.HasFlag.HasValue)
                    {
                        flag = client.HasFlag.Value;
                    }
                    var clientConf = new ClientConfirmation(client.Id, client.Name, flag);
                    clientsConfirmations.Add(clientConf);
                }
            }
            var total = sw.ElapsedMilliseconds;
            clientsConfirmations.Clear();
            return total;
        }

        // AddRange + Select: 1000 passes over the clients into a single list.
        private static long Version2(IEnumerable<Client> clients)
        {
            var sw = System.Diagnostics.Stopwatch.StartNew();
            var clientsConfirmations = new List<ClientConfirmation>();
            for (var i = 0; i < 1000; i++)
            {
                clientsConfirmations.AddRange(
                    clients.Select(client =>
                        new ClientConfirmation(client.Id, client.Name, client.HasFlag ?? false)
                    )
                );
            }
            var total = sw.ElapsedMilliseconds;
            clientsConfirmations.Clear();
            return total;
        }
    }
}
Edit: Create list with the right size
As @PanagiotisKanavos mentioned, setting the list capacity at creation prevents the code from doing multiple allocations to fit the data. I modified the benchmark to take this into account.
var clientsConfirmations = new List<ClientConfirmation>(clients.Count());
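One thing that may be worth checking: in the benchmark code the list is created once and then filled across all 1000 passes, so a capacity of clients.Count() only covers the first pass. The sketch below counts how often the backing array changes size for different initial capacities. CapacityDemo and CountReallocations are hypothetical names, and the pass count is reduced to 20 to keep the example quick; it illustrates List<T> growth only and does not claim to explain the timings.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Hypothetical helper: counts how many times List<int>.Capacity changes
    // while adding `total` items to a list created with `initialCapacity`.
    public static int CountReallocations(int initialCapacity, int total)
    {
        var list = new List<int>(initialCapacity);
        var reallocations = 0;
        var capacity = list.Capacity;
        for (var i = 0; i < total; i++)
        {
            list.Add(i);
            if (list.Capacity != capacity)
            {
                reallocations++;
                capacity = list.Capacity;
            }
        }
        return reallocations;
    }

    public static void Main()
    {
        const int perPass = 50_000;
        const int passes = 20; // the benchmark uses 1000 passes; reduced here
        var total = perPass * passes;

        // No capacity hint: the array grows many times.
        Console.WriteLine(CountReallocations(0, total));
        // Capacity for one pass only: fewer growths, but still some.
        Console.WriteLine(CountReallocations(perPass, total));
        // Capacity for every pass: no growth at all.
        Console.WriteLine(CountReallocations(total, total));
    }
}
```

If the goal is to remove reallocation from the measurement entirely, the capacity would have to be the per-pass count multiplied by the number of passes, or the list would have to be created inside the outer loop.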
Here are the results:
Iteration | First run | Add (ms) | AddRange (ms) | Delta (ms) |
---|---|---|---|---|
1 | Add | 24150 | 23104 | -1046 |
2 | AddRange | 24602 | 20391 | -4211 |
3 | Add | 25723 | 25510 | -213 |
4 | AddRange | 23510 | 23041 | -469 |
5 | Add | 22621 | 20287 | -2334 |
6 | AddRange | 21417 | 22995 | 1578 |
7 | Add | 23371 | 25797 | 2426 |
8 | AddRange | 24457 | 24695 | 238 |
9 | Add | 24980 | 24686 | -294 |
10 | AddRange | 25486 | 24038 | -1448 |
Avg | - | 24031.7 | 23454.4 | -577.3 |
In that case, the trend is reversed and AddRange gets the win but I can't figure out why...