I got the idea of a Substring struct
from this post and this one. The second post has the implementation of .net's String.GetHashCode(). (I'm not sure which version of .net this is from.)
Here is the implementation. (GetHashCode is taken from the second source listed above.)
public struct Substring
{
private string String;
private int Offset;
public int Length { get; private set; }
public char this[int index] { get { return String[Offset + index]; } }
public Substring(string str, int offset, int len) : this()
{
String = str;
Offset = offset;
Length = len;
}
/// <summary>
/// See http://www.dotnetperls.com/gethashcode
/// </summary>
/// <returns></returns>
public unsafe override int GetHashCode()
{
fixed (char* str = String + Offset)
{
char* chPtr = str;
int num = 352654597;
int num2 = num;
int* numPtr = (int*)chPtr;
for (int i = Length; i > 0; i -= 4)
{
num = (((num << 5) + num) + (num >> 27)) ^ numPtr[0];
if (i <= 2)
{
break;
}
num2 = (((num2 << 5) + num2) + (num2 >> 27)) ^ numPtr[1];
numPtr += 2;
}
return (num + (num2 * 1566083941));
}
}
}
Here's a unit test:
[Test]
public void GetHashCode_IsAsFastAsString()
{
var s = "The quick brown fox";
var sub = new Substring(s, 1, 5);
var t = "quick";
var sum = 0;
sum += sub.GetHashCode(); // make sure GetHashCode is jitted
var count = 100000000;
var sw = Stopwatch.StartNew();
for (var i = 0; i < count; ++i)
sum += t.GetHashCode();
var t1 = sw.Elapsed;
sw = Stopwatch.StartNew();
for (var i = 0; i < count; ++i)
sum += sub.GetHashCode();
var t2 = sw.Elapsed;
Debug.WriteLine(sum.ToString()); // make sure we use the return value
var m1 = t1.Milliseconds;
var m2 = t2.Milliseconds;
Assert.IsTrue(m2 <= m1); // fat chance
}
The problem is that m1 is 10 milliseconds and m2 is 190 milliseconds. (Note: this is with 1000000 iterations.) FYI, I ran this on .net 4.5 64 bit Release build with Optimizations turned on.
At this time I'm disappointed about not getting parity performance, but this was an excellent use of my time in learning about some of the perils and pitfalls of optimizing C#.