c# sql-server azure azure-elastic-scale

Azure Elastic Database Merge GUID Key Shards


In Azure we have four shards, and I want to remove two of them because we no longer need them. Their data should be merged into the other two shards.

I use a list shard map with GUIDs as keys to identify the shard (in our application the key is the UserId). In the tutorials I only found samples for merging shards of the range type. Is there a faster way to merge this type of shard, or do I have to write my own tool for this?

If the merge is performed automatically, what happens in the following case? The GUID that identifies the shard is the UserId, and this data is moved from Shard A to Shard B. Another table called Comments has the UserId as a foreign key. The primary key of this table is a classic numeric auto-increment value. What happens to those values when the rows are moved from Shard A to Shard B? Are the rows inserted with newly assigned IDs, or will this not work at all?

There is also some local file storage involved that uses IDs in its paths, so I think I will have to write my own tool anyway.

For that I took a look at the ShardMapManager but did not fully understand how it works. The ShardMappingsGlobal table has a column called MappingId, but this is not the GUID/UserId that is stored in the shard database. How do I get the actual GUID that is used to identify the shard, in my case the UserId? I also did not find any methods to move data between shards. My current plan is to transfer the data between the shards with my own tool and then use the ListShardMap.UpdateMapping method to assign the new shard to the key. At the end of the operation I would call ListShardMap.DeleteShard. Or is there a better way to do this?

EDIT:

I wrote my own tool to merge the shards, but now I get a strange exception. Here is some code:

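        Guid userKey = Guid.Parse(userId);
        ListShardMap<Guid> map = GetUserShardMap<Guid>();

        try
        {
            // Look up the current mapping for this user's key.
            PointMapping<Guid> currentMapping = map.GetMappingForKey(userKey);

            // Take the mapping offline before moving the data.
            PointMapping<Guid> mappingOffline = map.UpdateMapping(currentMapping, new PointMappingUpdate()
            {
                Status = MappingStatus.Offline
            });
        }
        // (the catch/finally part of this try block was not included in the post)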

The UpdateMapping call causes the following exception:

Store Error: Error 515, Level 16, State 2, Procedure __ShardManagement.spBulkOperationShardMappingsLocal, Line 98, Message: Cannot insert the value NULL into column 'LockOwnerId', table __ShardManagement.ShardMappingsLocal

I do not understand why there is an insert at all. I checked for the MappingId in the local and global shard mapping tables, and the mapping is there, so in my opinion no insert should be required. I also looked at the code of the stored procedure spBulkOperationShardMappingsLocal here: https://github.com/Azure/elastic-db-tools/blob/master/Src/ElasticScale.Client/ShardManagement/Scripts/UpgradeShardMapManagerLocalFrom1.1To1.2.sql The INSERT statement does not pass LockOwnerId as a parameter, so it can only fail. I am currently working with a test setup because I do not want to experiment on the production system. Maybe I made a mistake there, but everything looks good to me. I would be very grateful for any hint regarding this error.


Solution

  • In the tutorials I only found samples for merging shards of the range type. Is there a faster way to merge this type of shard, or do I have to write my own tool for this?

    Yes, the Split-Merge tool can move data for both range and list shard maps. For a list shard map you can issue a shardlet move request for each key. Unfortunately, the Split-Merge tool is complicated to set up; the last time I did it, it took me around an hour to configure. I know this is not great, so I'll leave it up to you to decide whether that takes more or less time than writing your own custom version.

    Another table called Comments has the UserId as a foreign key. The primary key of this table is a classic numeric auto-increment value. What happens to those values when the rows are moved from Shard A to Shard B? Are the rows inserted with newly assigned IDs, or will this not work at all?

    The values of auto-increment columns are not copied over; they are regenerated at the destination, so new IDs will be assigned to these rows.
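If IDs are regenerated at the destination, anything outside the database that refers to the old IDs (such as the file storage paths mentioned in the question) has to be re-pointed. The following is a minimal in-memory sketch of that idea, not the Split-Merge implementation: `Comment`, `CommentTable`, and `CommentMover` are hypothetical stand-ins for the real tables, and the only point is that the copy step can return an old-ID to new-ID map.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type standing in for the Comments table.
public sealed class Comment
{
    public int Id;
    public Guid UserId;
    public string Text;
}

// In-memory stand-in for a table with an auto-increment primary key.
public sealed class CommentTable
{
    private int nextId;
    public readonly List<Comment> Rows = new List<Comment>();

    public CommentTable(int firstFreeId) { nextId = firstFreeId; }

    // Mimics an INSERT where the table itself assigns the identity value.
    public int Insert(Guid userId, string text)
    {
        int id = nextId++;
        Rows.Add(new Comment { Id = id, UserId = userId, Text = text });
        return id;
    }
}

public static class CommentMover
{
    // Moves one user's comments to the destination and returns old Id -> new Id.
    public static Dictionary<int, int> MoveUser(
        Guid userId, CommentTable source, CommentTable destination)
    {
        var idMap = new Dictionary<int, int>();

        // FindAll returns a copy, so removing from source.Rows inside the loop is safe.
        foreach (Comment row in source.Rows.FindAll(c => c.UserId == userId))
        {
            idMap[row.Id] = destination.Insert(row.UserId, row.Text);
            source.Rows.Remove(row);
        }
        return idMap;
    }
}
```

A custom tool could use the returned map to rename files or update any other references that embed the old comment IDs.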

    For that I took a look at the ShardMapManager but did not fully understand how it works. The ShardMappingsGlobal table has a column called MappingId, but this is not the GUID/UserId that is stored in the shard database. How do I get the actual GUID that is used to identify the shard, in my case the UserId?

    I would strongly suggest not editing the ShardMapManager tables on your own; it's very easy to mess up. Editing the ShardMapManager tables is precisely what the Elastic Database Tools library is designed to do.

    You can update the metadata for a mapping by using the ListShardMap.UpdateMapping method. To be clear, this only updates the ShardMapManager tables' record of where the data for the key should be. Actually moving the data must be done by a higher layer.
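As for reading the actual GUID keys without touching the tables: the client library exposes them on the mapping objects. This is a small sketch, assuming the Microsoft.Azure.SqlDatabase.ElasticScale.Client package is referenced; the connection string and shard map name are placeholders you would supply, and `MappingLister` is a hypothetical helper name.

```csharp
using System;
using Microsoft.Azure.SqlDatabase.ElasticScale.ShardManagement;

public static class MappingLister
{
    // Prints every key in the list shard map together with its shard and status.
    public static void PrintMappings(string shardMapManagerConnectionString, string shardMapName)
    {
        ShardMapManager manager = ShardMapManagerFactory.GetSqlShardMapManager(
            shardMapManagerConnectionString, ShardMapManagerLoadPolicy.Lazy);

        ListShardMap<Guid> map = manager.GetListShardMap<Guid>(shardMapName);

        foreach (PointMapping<Guid> mapping in map.GetMappings())
        {
            // mapping.Value is the key itself (here: the UserId), not the internal MappingId.
            Console.WriteLine("{0} -> {1} ({2})",
                mapping.Value, mapping.Shard.Location.Database, mapping.Status);
        }
    }
}
```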

    This is a high-level summary of what the Split-Merge service does:

    1. Lock the mapping to prevent concurrent updates from another shard map management operation.
    2. Mark the mapping offline with ListShardMap.UpdateMapping. This prevents data-dependent routing with OpenConnectionForKey from accessing data with that key. It also kills all current sessions on the shard to force them to reconnect, which ensures that no active connections are operating on data with the now-offline key.
    3. Move the underlying data, using the shard map's SchemaInfo to determine which tables need to be moved.
    4. Update the mapping so that it points to the destination shard and mark it online with ListShardMap.UpdateMapping.
    5. Unlock the mapping.
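For a custom tool, the same sequence could look roughly like the sketch below. It assumes the Elastic Database client library is referenced; `ShardletMover` and the `moveData` callback are hypothetical, and the callback stands in for whatever code copies the rows for one key to the target shard and removes them from the source. The shard change and the switch back to online are done as two separate updates here, on the assumption that the library expects a mapping to be offline before its shard is changed.

```csharp
using System;
using Microsoft.Azure.SqlDatabase.ElasticScale.ShardManagement;

public static class ShardletMover
{
    public static void MoveKey(
        ListShardMap<Guid> map,
        Guid key,
        Shard target,
        Action<Guid, Shard, Shard> moveData) // hypothetical: (key, source, target)
    {
        PointMapping<Guid> current = map.GetMappingForKey(key);
        Shard source = current.Shard;

        // 1. Lock the mapping so that no other shard map operation can change it.
        MappingLockToken token = MappingLockToken.Create();
        map.LockMapping(current, token);
        try
        {
            // 2. Take the mapping offline; each update returns the new mapping object.
            current = map.UpdateMapping(
                current, new PointMappingUpdate { Status = MappingStatus.Offline }, token);

            // 3. Move the rows that belong to this key (application-specific).
            moveData(key, source, target);

            // 4. Point the mapping at the target shard, then bring it back online.
            current = map.UpdateMapping(
                current, new PointMappingUpdate { Shard = target }, token);
            current = map.UpdateMapping(
                current, new PointMappingUpdate { Status = MappingStatus.Online }, token);
        }
        finally
        {
            // 5. Unlock using the most recent mapping object. If a step above failed,
            //    the mapping may still be offline and needs manual attention.
            map.UnlockMapping(current, token);
        }
    }
}
```

Once every key has been moved off a shard this way, the emptied shard can be removed with ListShardMap.DeleteShard, as proposed in the question.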