I am testing different strategies for a incoming breaking change. The problem is that each experiment would carry some costs in Azure. The data is huge, and can have some inconsistencies due to many years with fixes and transactions before I even knew the company.
I need to change a column in a table with million of records and dozens of indexes. This will have a big downtime.
ALTER TABLE X ALTER COLUMN A1 decimal(15, 4) --The original column is int
One of the initial ideas (Now I know this is not possible) is to have a secondary replica, do the changes there, and, when changes finish, swap primary with secondary... zero or almost zero downtime. I am referring to a "live", redundant replica, not just a "copy"
EDIT: Throwing new ideas:
I have tried this, but takes AGES to complete. Also, some null and FK violations make the process to fail after processing for several hours. "Resuming" could be an option but it makes the process slower with each execution. Without some kind of "Resume", each failure have to be repeated from scratch
An acceptable improvement could be to IGNORE the errors (but create logs, of course) and apply fixes after migration. But afaik, AzureSql (nor SqlServer) doesn't offer an "ignore" option
Also tried this one. Some indexes take AGES to complete. But for now seems to be the best bet. There is a possible variation by applying ROW COMPRESSION before the datatype change, but I think it will not improve the real deal: index re-creation
A friend thought of a variation on this, which is to duplicate the needed indexes ONLINE for the column copy. In the meanwhile, trigger all changes on source column to the column copy.
For any of the mentioned strategies, some gain can be obtained by increasing the processing power. But, anyway, we consider to increase the power with any of the approaches, therefore this is common for all solutions
After many tests and backups, we finally used the following aproach:
--> You can consider it is finally done!
I selected the other answer, because I think it could be also useful in most situations. This one has more steps but has a very little downtime and is reversible at any step but the last.