What's the fastest way to parse strings in C#?
Currently I'm just using string indexing (string[index]) and the code runs reasonably well, but I can't help thinking that the continuous range checking that the index accessor does must be adding some overhead.
So, I'm wondering what techniques I should consider to give it a boost. These are my initial thoughts/questions:
string.IndexOf() and IndexOfAny() to find characters of interest. Are these faster than manually scanning a string by string[index]?
NB: I should say, the strings I'm parsing could be reasonably large (say 30k) and are in a custom format for which there is no standard .NET parser. Also, performance of this code is not super critical, so this is partly just a theoretical question asked out of curiosity.
30k is not what I would consider to be large. Before getting excited, I would profile. The indexer should be fine for the best balance of flexibility and safety.
For example, creating a 128k-character string (and a separate array of the same size), filling it with junk (including the time to handle Random), and summing all the character code points via the indexer takes... 3ms:
using System;
using System.Diagnostics;

var watch = Stopwatch.StartNew();
char[] chars = new char[128 * 1024];
Random rand = new Random();
// fill with junk: random lowercase letters
for (int i = 0; i < chars.Length; i++)
{
    chars[i] = (char) ((int) 'a' + rand.Next(26));
}
int sum = 0;
string s = new string(chars);
int len = s.Length;
for (int i = 0; i < len; i++)
{
    sum += (int) s[i]; // read through the string indexer, not the array
}
watch.Stop();
Console.WriteLine(sum);
Console.WriteLine(watch.ElapsedMilliseconds + "ms");
Console.ReadLine(); // keep the console window open
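On the IndexOf()/IndexOfAny() part of the question, the same kind of measurement can be sketched as below: count a couple of delimiter characters once with the indexer and once with IndexOfAny(), and time each. The class name, the delimiter set, and the sample input are all made up for illustration, and the numbers will vary with machine and runtime, so treat it as a starting point for your own profiling rather than a verdict.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class ScanComparison
{
    // Hypothetical delimiter set, chosen only for this illustration.
    static readonly char[] Delimiters = { ',', ';' };

    // Count delimiters by walking the string with the indexer.
    public static int CountWithIndexer(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            if (c == ',' || c == ';') count++;
        }
        return count;
    }

    // Count delimiters by jumping from match to match with IndexOfAny.
    // A start index equal to s.Length is allowed and simply returns -1.
    public static int CountWithIndexOfAny(string s)
    {
        int count = 0;
        for (int pos = s.IndexOfAny(Delimiters, 0);
             pos >= 0;
             pos = s.IndexOfAny(Delimiters, pos + 1))
        {
            count++;
        }
        return count;
    }

    static void Main()
    {
        // Made-up sample input: 110,000 characters with 20,000 delimiters.
        string s = string.Concat(Enumerable.Repeat("abc,def;ghi", 10000));

        var watch = Stopwatch.StartNew();
        int a = CountWithIndexer(s);
        watch.Stop();
        Console.WriteLine("indexer:    " + a + " in " + watch.Elapsed.TotalMilliseconds + "ms");

        watch.Restart();
        int b = CountWithIndexOfAny(s);
        watch.Stop();
        Console.WriteLine("IndexOfAny: " + b + " in " + watch.Elapsed.TotalMilliseconds + "ms");
    }
}
```

Which one wins is likely to depend on how sparse the characters of interest are, which is one more reason to measure with your real data before changing anything.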
For files that are actually large, a reader-based approach should be used: StreamReader, etc.
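A minimal sketch of what the reader approach might look like, assuming the data can be processed one chunk at a time. The class name, the buffer size, and the "data.txt" path are placeholders, and summing the characters just stands in for whatever the real parser would do with each chunk.

```csharp
using System;
using System.IO;

static class ChunkedReader
{
    // Sum the character values from any TextReader, reading fixed-size
    // chunks so the whole input never has to be held in one string.
    public static long SumChars(TextReader reader, int bufferSize = 4096)
    {
        char[] buffer = new char[bufferSize];
        long sum = 0;
        int read;
        // Read returns the number of characters read, or 0 at end of input.
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                sum += buffer[i];
            }
        }
        return sum;
    }

    static void Main(string[] args)
    {
        // "data.txt" is a placeholder; pass a real path as the first argument.
        string path = args.Length > 0 ? args[0] : "data.txt";
        using (var reader = new StreamReader(path))
        {
            Console.WriteLine(SumChars(reader));
        }
    }
}
```

Taking a TextReader rather than a StreamReader means the same method can be tried against a StringReader for small in-memory inputs.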