Search code examples
asp.netazureazure-elastic-scale

Azure Elastic Scale Sharding Key


I am moving my datamodel to Azure Elastic Scale.

After some testing and some experiences I fall in love if it, it is simple and with that kind of approach the code remains clean and easy to maintain.

I just have one big question, where is the Sharding key defined? I cannot find info on the sample downloaded from Visual Studio and I can beat that this is a straight forward answer.

In the sample offered by Microsoft the default sharding key is CustomerId but i cannot find where the reference to that key takes place.

Could it be in ShardMapName from configuration file?

Thanks in advance.


Solution

  • There is no explicit link between the the sharding key in the SQL schema and its usage (in code).

    So in the Getting Started sample, the Customers and Orders table both contain a CustomerId column, and you can see that in DataDependentRoutingSample.cs when we access these tables we make sure to provide the same customerId value to the shardMap.OpenConnectionForKey method that we then use for customerId column (both in the SELECT and INSERT statements) in the following query.

    // Looks up the key in the shard map and opens a connection to the shard
    using (SqlConnection conn = shardMap.OpenConnectionForKey(customerId, credentialsConnectionString))
    {
        // Create a simple command that will insert or update the customer information
        SqlCommand cmd = conn.CreateCommand();
        cmd.CommandText = @"
        IF EXISTS (SELECT 1 FROM Customers WHERE CustomerId = @customerId)
            UPDATE Customers
                SET Name = @name, RegionId = @regionId
                WHERE CustomerId = @customerId
        ELSE
            INSERT INTO Customers (CustomerId, Name, RegionId)
            VALUES (@customerId, @name, @regionId)";
        cmd.Parameters.AddWithValue("@customerId", customerId);
        cmd.Parameters.AddWithValue("@name", name);
        cmd.Parameters.AddWithValue("@regionId", regionId);
        cmd.CommandTimeout = 60;
    
        // Execute the command
        cmd.ExecuteNonQuery();
    }
    

    In other words, when you provide a certain key value in the OpenConnectionForKey call, it's your responsibility to make sure that all SQL queries with that connection are restricted to that key value, otherwise you may end up with incorrect results (e.g. if it was a SELECT query) or rows living on the wrong shard (e.g. if it was an INSERT query).

    It's possible to fix this safety issue by using the new Row-Level Security feature. We have a sample called Entity Framework Multi-Tenant Shards that demonstrates how to combine Shard Maps with Row-Level Security. The relevant code is in ElasticScaleContext.cs:

    SqlConnection conn = null; 
    try 
    { 
        // Ask shard map to broker a validated connection for the given key 
        conn = shardMap.OpenConnectionForKey(shardingKey, connectionStr, ConnectionOptions.Validate); 
    
        // Set CONTEXT_INFO to shardingKey to enable Row-Level Security filtering 
        SqlCommand cmd = conn.CreateCommand(); 
        cmd.CommandText = @"SET CONTEXT_INFO @shardingKey"; 
        cmd.Parameters.AddWithValue("@shardingKey", shardingKey); 
        cmd.ExecuteNonQuery(); 
    
        return conn; 
    } 
    catch (Exception) 
    { 
        if (conn != null) 
        { 
            conn.Dispose(); 
        } 
    
        throw; 
    }
    

    Thanks for your great question!