
Django 1.10 seed database without using fixtures


So I've looked at the documentation, as well as this SO question, and the django-seed package, but none of these seem to fit what I'm trying to do.

Basically, I want to programmatically seed my Games model from an external API, but all the information I can find seems to be reliant on generating a fixture first, which seems like an unnecessary step.

For example, in Ruby/Rails you can write directly to seeds.rb and seed the database in any way you like.

Is similar functionality available in Django, or do I need to generate the fixture from the API first and then import it?


Solution

  • You can use a data migration for this. First create an empty migration for your app:

    $ python manage.py makemigrations yourappname --empty
    

    In your empty migration, create a function to load your data and add a migrations.RunPython operation. Here's a modified version of the one from the Django documentation on migrations:

    from __future__ import unicode_literals
    from django.db import migrations
    
    def stream_from_api():
        ...
    
    def load_data(apps, schema_editor):
        # We can't import the Person model directly as it may be a newer
        # version than this migration expects. We use the historical version.
        Person = apps.get_model('yourappname', 'Person')
    
        for item in stream_from_api():
            person = Person(first=item['first'], last=item['last'], age=item['age'])
            person.save()
    
    class Migration(migrations.Migration):
        dependencies = [('yourappname', '0009_something')]
        operations = [migrations.RunPython(load_data)]
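
    The body of stream_from_api is left out above because it depends entirely on the API. As a minimal sketch only, assuming Python 3 and an endpoint that returns a JSON array of objects with first, last and age keys, it might look like the following. The URL and the opener parameter are hypothetical, not part of any real API:

```python
import json
from urllib.request import urlopen

# Hypothetical endpoint: replace with the real API URL.
API_URL = "https://api.example.com/people"


def stream_from_api(url=API_URL, opener=urlopen):
    """Yield one dict per record from a JSON array returned by `url`.

    `opener` is injectable so the generator can be exercised without
    network access; it must return a file-like object.
    """
    response = opener(url)
    try:
        records = json.loads(response.read().decode("utf-8"))
    finally:
        response.close()
    for item in records:
        # Keep only the keys that load_data() reads.
        yield {"first": item["first"], "last": item["last"], "age": item["age"]}
```

    Keep in mind that a migration runs on every fresh database, including the test database, so the API would be called there as well.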
    

    If you have a lot of simple data, you might benefit from the bulk-creation methods:

    from __future__ import unicode_literals
    from django.db import migrations
    
    def stream_from_api():
        ...
    
    def load_data(apps, schema_editor):
        # We can't import the Person model directly as it may be a newer
        # version than this migration expects. We use the historical version.
        Person = apps.get_model('yourappname', 'Person')
    
        def stream_people():
            for item in stream_from_api():
                yield Person(first=item['first'], last=item['last'], age=item['age'])
    
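        # Adjust (or remove) the batch size depending on your needs.
        # bulk_create() does not call save() or send pre_save/post_save signals,
        # and on most backends it does not set primary keys on the new objects,
        # so it is not suitable when your objects depend on one another.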
        Person.objects.bulk_create(stream_people(), batch_size=10000)
    
    class Migration(migrations.Migration):
        dependencies = [('yourappname', '0009_something')]
        operations = [migrations.RunPython(load_data)]
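
    One caveat about memory: as far as I know, bulk_create converts its argument to a list before splitting it into batches, so passing a generator does not by itself keep the whole data set out of memory. If that matters, one option is to chunk the stream yourself. This is a sketch, not Django API; chunked and bulk_load are hypothetical helper names, and manager stands in for Person.objects:

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def bulk_load(manager, objects, chunk_size=10000):
    """Insert `objects` one chunk at a time; return the number inserted.

    `manager` is expected to behave like Person.objects, i.e. to offer
    bulk_create(list_of_instances).
    """
    total = 0
    for chunk in chunked(objects, chunk_size):
        manager.bulk_create(chunk)
        total += len(chunk)
    return total
```

    Inside load_data this would be called as bulk_load(Person.objects, stream_people()).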
    

    Migrations have the added benefit that, on databases that support transactions, the RunPython operation runs inside a transaction by default. If the load fails or is interrupted partway through, the rows inserted so far are rolled back instead of leaving the table half-populated.
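
    To illustrate what being enclosed in a transaction buys you, here is a small stand-alone sketch using the standard-library sqlite3 module rather than Django itself. The person table and the load_rows helper are made up for the example; the second record is deliberately invalid, so the first insert is undone along with it:

```python
import sqlite3


def load_rows(conn, rows):
    """Insert all rows in one transaction; roll everything back on error."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for first, age in rows:
                if age is None:
                    raise ValueError("bad record: %r" % (first,))
                conn.execute(
                    "INSERT INTO person (first, age) VALUES (?, ?)", (first, age)
                )
    except ValueError:
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first TEXT, age INTEGER)")
conn.commit()

ok = load_rows(conn, [("Ada", 36), ("Grace", None)])  # second record fails
count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(ok, count)  # prints: False 0
```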