Loading initial data with Django 1.7 and data migrations
Update: See @GwynBleidD's comment below for the problems this solution can cause, and see @Rockallite's answer below for an approach that's more durable to future model changes.
Assuming you have a fixture file in <yourapp>/fixtures/initial_data.json
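For reference, a Django JSON fixture is a list of entries, each with a "model", a "pk" and a "fields" mapping. Here is a minimal sketch that builds such a file's contents with the standard json module. The model label yourapp.modelname and the field names name and is_active are assumptions made up for illustration, so adapt them to your own model.

```python
import json

# Hypothetical records for a model "ModelName" in the app "yourapp".
# The field names ("name", "is_active") are illustrative assumptions.
records = [
    {"model": "yourapp.modelname", "pk": 1, "fields": {"name": "First", "is_active": True}},
    {"model": "yourapp.modelname", "pk": 2, "fields": {"name": "Second", "is_active": False}},
]


def dump_fixture(objects):
    """Serialize a list of fixture entries to an indented JSON string."""
    return json.dumps(objects, indent=2, sort_keys=True)


fixture_text = dump_fixture(records)
print(fixture_text)
```

Saving that text as <yourapp>/fixtures/initial_data.json would give you a file of the shape the steps below expect.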
1. Create your empty migration:
In Django 1.7:
    python manage.py makemigrations --empty <yourapp>
In Django 1.8+, you can provide a name:
    python manage.py makemigrations --empty <yourapp> --name load_initial_data
2. Edit your migration file <yourapp>/migrations/0002_auto_xxx.py
2.1. Custom implementation, inspired by Django's loaddata (initial answer):

    import os

    from django.core import serializers
    from django.db import migrations

    fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
    fixture_filename = 'initial_data.json'


    def load_fixture(apps, schema_editor):
        fixture_file = os.path.join(fixture_dir, fixture_filename)

        # Open in binary mode; the file is closed even if deserialization fails.
        with open(fixture_file, 'rb') as fixture:
            objects = serializers.deserialize('json', fixture, ignorenonexistent=True)
            for obj in objects:
                obj.save()


    def unload_fixture(apps, schema_editor):
        "Brutally deleting all entries for this model..."
        MyModel = apps.get_model("yourapp", "ModelName")
        MyModel.objects.all().delete()


    class Migration(migrations.Migration):

        dependencies = [
            ('yourapp', '0001_initial'),
        ]

        operations = [
            migrations.RunPython(load_fixture, reverse_code=unload_fixture),
        ]
2.2. A simpler solution for load_fixture (per @juliocesar's suggestion):

    import os

    from django.core.management import call_command

    fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
    fixture_filename = 'initial_data.json'


    def load_fixture(apps, schema_editor):
        fixture_file = os.path.join(fixture_dir, fixture_filename)
        call_command('loaddata', fixture_file)
Useful if you want to use a custom directory.
2.3. Simplest: calling loaddata with app_label will load fixtures from <yourapp>'s fixtures dir automatically:

    from django.core.management import call_command

    fixture = 'initial_data'


    def load_fixture(apps, schema_editor):
        call_command('loaddata', fixture, app_label='yourapp')
If you don't specify app_label, loaddata will try to load the fixture filename from every app's fixtures directory (which you probably don't want).
3. Run it:
    python manage.py migrate <yourapp>
Short version
You should NOT use the loaddata management command directly in a data migration.
    # Bad example for a data migration
    from django.db import migrations
    from django.core.management import call_command


    def load_fixture(apps, schema_editor):
        # No, it's wrong. DON'T DO THIS!
        call_command('loaddata', 'your_data.json', app_label='yourapp')


    class Migration(migrations.Migration):
        dependencies = [
            # Dependencies to other migrations
        ]

        operations = [
            migrations.RunPython(load_fixture),
        ]
Long version
loaddata utilizes django.core.serializers.python.Deserializer, which uses the most up-to-date models to deserialize historical data in a migration. That's incorrect behavior.
For example, suppose that there is a data migration which utilizes the loaddata management command to load data from a fixture, and it's already applied on your development environment.
Later, you decide to add a new required field to the corresponding model, so you do it and make a new migration against your updated model (and possibly provide a one-off value for the new field when ./manage.py makemigrations prompts you).
You run the next migration, and all is well.
Finally, you're done developing your Django application, and you deploy it on the production server. Now it's time for you to run all the migrations from scratch on the production environment.
However, the data migration fails. That's because the deserialized model from the loaddata command, which represents the current code, can't be saved with empty data for the new required field you added. The original fixture lacks the necessary data for it!
But even if you update the fixture with the required data for the new field, the data migration still fails. When the data migration is running, the next migration, which adds the corresponding column to the database, has not been applied yet. You can't save data to a column which does not exist!
Conclusion: in a data migration, the loaddata command introduces potential inconsistency between the model and the database. You should definitely NOT use it directly in a data migration.
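The "column does not exist yet" failure can be reproduced outside Django. This is only a minimal sketch using the standard sqlite3 module instead of the ORM, and the table name yourapp_modelname and the column category are hypothetical. It walks through the same order of events: the table is created, a row shaped like the current model is inserted too early, and the same insert is retried after the column has been added.

```python
import sqlite3


def try_insert(conn, sql, params):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.OperationalError as exc:
        return "failed: %s" % exc


conn = sqlite3.connect(":memory:")

# State after the initial schema migration: there is no "category" column yet.
conn.execute(
    "CREATE TABLE yourapp_modelname (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

insert_sql = "INSERT INTO yourapp_modelname (id, name, category) VALUES (?, ?, ?)"
row = (1, "First", "default")

# The data migration runs now, but the row is shaped like the *current* model,
# which already knows about "category".
before = try_insert(conn, insert_sql, row)

# The later schema migration finally adds the column.
conn.execute(
    "ALTER TABLE yourapp_modelname ADD COLUMN category TEXT NOT NULL DEFAULT ''"
)

# The very same insert is accepted once the column exists.
after = try_insert(conn, insert_sql, row)

print(before)
print(after)
```

The first attempt is rejected and the second succeeds, which is why the order of migrations matters more than the contents of the fixture.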
The Solution
The loaddata command relies on the django.core.serializers.python._get_model function to get the corresponding model from a fixture, which will return the most up-to-date version of a model. We need to monkey-patch it so it gets the historical model.
(The following code works for Django 1.8.x)
    # Good example for a data migration
    from django.db import migrations
    from django.core.serializers import base, python
    from django.core.management import call_command


    def load_fixture(apps, schema_editor):
        # Save the old _get_model() function
        old_get_model = python._get_model

        # Define new _get_model() function here, which utilizes the apps argument to
        # get the historical version of a model. This piece of code is directly stolen
        # from django.core.serializers.python._get_model, unchanged. However, here it
        # has a different context, specifically, the apps variable.
        def _get_model(model_identifier):
            try:
                return apps.get_model(model_identifier)
            except (LookupError, TypeError):
                raise base.DeserializationError("Invalid model identifier: '%s'" % model_identifier)

        # Replace the _get_model() function on the module, so loaddata can utilize it.
        python._get_model = _get_model

        try:
            # Call loaddata command
            call_command('loaddata', 'your_data.json', app_label='yourapp')
        finally:
            # Restore old _get_model() function
            python._get_model = old_get_model


    class Migration(migrations.Migration):
        dependencies = [
            # Dependencies to other migrations
        ]

        operations = [
            migrations.RunPython(load_fixture),
        ]
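The save / replace / restore-in-finally pattern used above can be factored into a small context manager. This is a sketch of that general pattern only, not something Django provides: patched_attribute is a hypothetical helper name, and fake_module is a stand-in object so the snippet runs without Django installed.

```python
import types
from contextlib import contextmanager


@contextmanager
def patched_attribute(target, name, replacement):
    """Temporarily replace target.<name>, restoring the original on exit."""
    original = getattr(target, name)
    setattr(target, name, replacement)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        setattr(target, name, original)


# Stand-in for the serializer module, so the sketch runs without Django.
fake_module = types.SimpleNamespace(
    _get_model=lambda identifier: "current:" + identifier
)


def historical_get_model(identifier):
    """Stand-in for a lookup based on the migration's apps registry."""
    return "historical:" + identifier


with patched_attribute(fake_module, "_get_model", historical_get_model):
    inside = fake_module._get_model("yourapp.modelname")

outside = fake_module._get_model("yourapp.modelname")

print(inside)
print(outside)
```

In a real migration you would presumably pass the python serializer module and the apps-based _get_model function in place of these stand-ins, and call loaddata inside the with block.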
In order to give your database some initial data, write a data migration. In the data migration, use the RunPython operation to load your data.
Don't rely on automatic loading of initial_data fixtures, as that approach is deprecated since Django 1.7.
Your data migrations will be run only once. Migrations are an ordered sequence. When the 003_xxxx.py migration is run, Django records in the database that this app has been migrated up to that point (003), and afterwards it runs only the migrations that follow it.
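To illustrate the "run only once" behaviour, here is a toy model of the bookkeeping in plain Python. It is a simplification, not Django's implementation: Django keeps its record of applied migrations in the django_migrations table, while this sketch uses an in-memory set, and the migration names are hypothetical.

```python
def pending_migrations(all_migrations, applied):
    """Return the migrations that still need to run, preserving their order."""
    return [name for name in all_migrations if name not in applied]


def migrate(all_migrations, applied):
    """Apply every pending migration once and record it as applied."""
    ran = []
    for name in pending_migrations(all_migrations, applied):
        # A real migration would execute its operations at this point.
        applied.add(name)
        ran.append(name)
    return ran


# Hypothetical migration names for one app, in dependency order.
migrations_on_disk = ["0001_initial", "0002_load_initial_data", "0003_add_field"]

# Stands in for the record of applied migrations kept in the database.
applied = {"0001_initial"}

first_run = migrate(migrations_on_disk, applied)
second_run = migrate(migrations_on_disk, applied)

print(first_run)   # only the migrations that were not yet recorded
print(second_run)  # empty list: nothing is run a second time
```

This is why a data migration that has already been applied on one machine is never re-executed there, while a fresh database replays the whole sequence from the start.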
Inspired by some of the comments (namely n__o's) and the fact that I have a lot of initial_data.* files spread out over multiple apps, I decided to create a Django app that would facilitate the creation of these data migrations.
Using django-migration-fixture you can simply run the following management command and it will search through all your INSTALLED_APPS for initial_data.* files and turn them into data migrations.
    ./manage.py create_initial_data_fixtures
    Migrations for 'eggs':
      0002_auto_20150107_0817.py:
    Migrations for 'sausage':
      Ignoring 'initial_data.yaml' - migration already exists.
    Migrations for 'foo':
      Ignoring 'initial_data.yaml' - not migrated.
See django-migration-fixture for install/usage instructions.
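To give a rough idea of what "searching for initial_data.* files" can look like, here is a sketch using only the standard library. It is not the package's actual implementation: find_initial_data is a hypothetical helper, and the directory layout is created in a temporary folder purely for the demonstration.

```python
import tempfile
from pathlib import Path


def find_initial_data(app_dirs):
    """Map each app directory name to the initial_data.* files in its fixtures/ folder."""
    found = {}
    for app_dir in app_dirs:
        fixtures_dir = Path(app_dir) / "fixtures"
        names = sorted(p.name for p in fixtures_dir.glob("initial_data.*"))
        if names:
            found[Path(app_dir).name] = names
    return found


# Build a throwaway project layout with three hypothetical apps.
with tempfile.TemporaryDirectory() as root:
    layout = {
        "eggs": ["initial_data.json"],
        "sausage": ["initial_data.yaml"],
        "foo": [],  # has a fixtures folder but no initial data
    }
    for app, files in layout.items():
        fixtures_dir = Path(root) / app / "fixtures"
        fixtures_dir.mkdir(parents=True)
        for filename in files:
            (fixtures_dir / filename).write_text("[]")

    result = find_initial_data(sorted(Path(root).iterdir()))

print(result)
```

A tool built on this idea would then generate one data migration per discovered file, skipping apps that have no such file or that already have a corresponding migration.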