.net · sql-server · sqlclr

Why would a function output different results when executing as a SQL Server CLR UDF?


I'm using a .NET DLL to hash strings so I can compare them later.

I've wrapped the call to that function in a SQL Server CLR UDF and published it to the server.

Now, when I execute the function, the output is different from the one I get when running the same code in a console application.

The function is defined as follows:

using System.Data.SqlTypes;

[Microsoft.SqlServer.Server.SqlFunction]
public static SqlInt32 CalculateHash(SqlString input)
{
    // Unwrap the SqlString and normalize it so the hash is always
    // computed on the same Unicode form, wherever the call comes from.
    string unwrappedValue = input.Value.Normalize();
    int hash = HashCalculator.Calculate(unwrappedValue);
    return new SqlInt32(hash);
}

As you can see, I'm unwrapping and normalizing the string before even calculating the hash. So, I would expect the results to be the same no matter where I'm calling that code from.

Given the string "Test 123", I'm getting:

-387939562 - When running from a Console Application
137570918  - When calling from SQL Server

SQL Server CLR UDFs do not allow Debug.Print (or similar) calls, so I can't easily print intermediate values from inside the function (a possible workaround is sketched below).
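For reference, one way around that limitation is to mirror the UDF's logic in a CLR stored procedure, since stored procedures can send informational messages to the client through SqlContext.Pipe.Send. This is only a minimal sketch, not part of the original project; the HashDiagnostics and DebugCalculateHash names are hypothetical, and it reuses the HashCalculator class from the question:

using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;

public class HashDiagnostics
{
    [SqlProcedure]
    public static void DebugCalculateHash(SqlString input)
    {
        // Same steps as the UDF, but the intermediate values are sent
        // to the client (they show up in the Messages tab in SSMS).
        string unwrappedValue = input.Value.Normalize();

        SqlContext.Pipe.Send("Normalized value: " + unwrappedValue);
        SqlContext.Pipe.Send("Hash: " + HashCalculator.Calculate(unwrappedValue));
    }
}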

And for some reason, Visual Studio won't stop at breakpoints in the *.cs files when debugging the UDF (but that's a different problem I'm still trying to figure out).

My question is: why would the same function give two different results? What is different about running inside the UDF that could cause this? I even tried changing the database collation, but that does not affect the function's result.

Edit: I managed to step into the code while running the SQL query and found that String.GetHashCode() returns a different value when running inside SQL Server. In any case, I had assumed any charset-related problem would go away, since I normalize the string before using it.

Edit 2: Given that GetHashCode seemed to be the problem, I checked its implementation here:

How is GetHashCode() of C# string implemented?

And found out that the .NET implementation differs between 32-bit and 64-bit processes. When I set my console application to run in 64-bit mode, the output matches what I get from SQL Server.
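For reference, a minimal console check (standard .NET calls only, nothing specific to my project) makes the bitness dependency visible; the printed hash for the same string differs between a 32-bit build and a 64-bit build of the same program:

using System;

class HashCheck
{
    static void Main()
    {
        // String.GetHashCode is not guaranteed to be stable across
        // platforms or processes; in particular, the value for the same
        // string differs between 32-bit and 64-bit builds.
        Console.WriteLine("64-bit process: " + Environment.Is64BitProcess);
        Console.WriteLine("Hash of 'Test 123': " + "Test 123".Normalize().GetHashCode());
    }
}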


Solution

  • Once I managed to debug the UDF running on SQL Server, I was finally able to figure out that the algorithm I'm using to hash a given string relies on .NET's GetHashCode method.

    And, at least for the String class, as per this question, the implementation gives different results on 32-bit and 64-bit platforms.

    So, when I ran the console application to test the functionality, it ended up running as a 32-bit process. But the installed SQL Server is 64-bit, which forces the other implementation of GetHashCode to run.

    I was able to replicate SQL Server's results by forcing the console application to run as a 64-bit process.

    Then it was only a matter of tweaking the parameters so that the 32-bit and 64-bit platforms give a similar (but not equal) result.
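For completeness, since HashCalculator itself isn't shown above, here is a rough sketch of how the hash could instead be made independent of process bitness: compute it over the UTF-8 bytes of the normalized string with an explicit algorithm such as FNV-1a, rather than relying on String.GetHashCode. This is an illustrative alternative, not the tweak I actually applied, and the StableHash name is hypothetical:

using System.Text;

public static class StableHash
{
    // FNV-1a (32-bit): a simple, well-known hash that gives the same
    // result on 32-bit and 64-bit platforms because it only works on
    // the raw UTF-8 bytes and never touches String.GetHashCode.
    public static int Calculate(string input)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(input.Normalize()))
        {
            hash ^= b;
            hash = unchecked(hash * prime);
        }
        return unchecked((int)hash);
    }
}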