Tags: python, django, database-migration, data-migration

Did I write the data migration for a field type change correctly?


Say, I have a model TestModel:

class TestModel(models.Model):
    field1 = models.CharField(max_length=255)
    field2 = models.IntegerField()

    def __str__(self):
        return f"{self.field1}"

Now I need to change the type of field2 to TextField. To avoid losing the data already stored in TestModel, I need to write a data migration.

So I create a new model NewTestModel:

class NewTestModel(models.Model):
    field1 = models.CharField(max_length=255)
    field2 = models.TextField()

    def __str__(self):
        return f"{self.field1}"

Then I run the python manage.py makemigrations command.

In the generated 0006_newtestmodel.py migration file, I add a copy_data function and run it with migrations.RunPython(copy_data):

from django.db import migrations, models

def copy_data(apps, schema_editor):
    # Use the historical models from the app registry, not direct imports.
    TestModel = apps.get_model("data_migrations", "TestModel")
    NewTestModel = apps.get_model("data_migrations", "NewTestModel")

    # Copy each row; the integer in field2 ends up as text in the new model.
    for old_object in TestModel.objects.all():
        new_object = NewTestModel(
            field1=old_object.field1,
            field2=old_object.field2,
        )
        new_object.save()


class Migration(migrations.Migration):

    dependencies = [
        ('data_migrations', '0005_testmodel'),
    ]

    operations = [
        migrations.CreateModel(
            name='NewTestModel',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('field1', models.CharField(max_length=255)),
                ('field2', models.TextField()),
            ],
        ),
        migrations.RunPython(copy_data)
    ]

Then I delete the TestModel model and run the migration commands.

After that, I rename NewTestModel to TestModel and run the migration commands once again.

Everything worked out as it was supposed to.

Did I do it right?


Solution

  • Did I do it right?

    Yes, but you can do it faster. Your loop issues one INSERT query per record, so if there is a lot of data to migrate, the migration can take a very long time. The longer it runs, the greater the chance that it gets interrupted, for example because the database goes down.

    We can copy in bulk with:

    def copy_data(apps, schema_editor):
        TestModel = apps.get_model('data_migrations', 'TestModel')
        NewTestModel = apps.get_model('data_migrations', 'NewTestModel')

        # Build all new objects in memory, then insert them with bulk_create
        # instead of calling save() once per row.
        NewTestModel.objects.bulk_create(
            [
                NewTestModel(field1=old_object.field1, field2=old_object.field2)
                for old_object in TestModel.objects.all()
            ]
        )
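
If the table is very large, building every new object in a single list may itself use a lot of memory. One possible variation, shown here only as a sketch, copies the rows in fixed-size batches. The batched helper and the BATCH_SIZE value are hypothetical names introduced for illustration and are not part of the original answer. The explicit str() call assumes field2 is never NULL, which matches the non-nullable IntegerField in the question.

```python
from itertools import islice

BATCH_SIZE = 1000  # hypothetical value; tune it for your data and database


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def copy_data(apps, schema_editor):
    # Historical models, as in the migration above.
    TestModel = apps.get_model("data_migrations", "TestModel")
    NewTestModel = apps.get_model("data_migrations", "NewTestModel")

    # Stream only the two needed columns instead of loading full objects.
    rows = TestModel.objects.values_list("field1", "field2").iterator(
        chunk_size=BATCH_SIZE
    )

    # One bulk_create call per batch keeps memory use bounded.
    for batch in batched(rows, BATCH_SIZE):
        NewTestModel.objects.bulk_create(
            [
                # field2 is assumed non-NULL, so str() is safe here.
                NewTestModel(field1=field1, field2=str(field2))
                for field1, field2 in batch
            ]
        )
```

The function is passed to migrations.RunPython(copy_data) in the same way as the versions above. Whether the batching is worth the extra code probably depends on how many rows TestModel holds; for a small table the single bulk_create call is simpler.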