Django And MySQL: Model Saving Order Is Important

 Nov. 19, 2021     0 comments

Often in one Django view you need to save multiple models that have "parent-child" relationship. For example, your view may receive data from a traditional HTML form that includes an inline formset or a JSON with a nested array of child objects from a JavaScript frontend app. In this case you need to save a parent model and create/update/delete some child models. Of course, to make the whole operation atomic all model saves should be wrapped in a database transaction to ensure the consistency of saved data. Let's assume we are using MySQL or MariaDB as a database backend for our Django project. And the question arises: "in which order we should save our models?". "What should we save first: parent or children?"

To answer this question let's conduct a small experiment. Let's take Django 2 models — Author and Book — connected via a foreign key relationship:

 Click to show
from django.db import models


class Author(models.Model):
    first_name = models.CharField(max_length=190)
    last_name = models.CharField(max_length=190)
    birth_date = models.DateField()
    country = models.CharField(max_length=190, blank=True, null=True)

    class Meta:
        ordering = ['last_name', 'first_name']

    def __str__(self):
        return f'{self.first_name} {self.last_name}'
    
    def __repr__(self):
        return f'<Author [{str(self)}]>'


class Book(models.Model):
    title = models.CharField(max_length=190)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    is_read = models.BooleanField(default=False)

    class Meta:
        ordering = ['title']

    def __str__(self):
        return f'"{self.title}"'
    
    def __repr__(self):
        return f'<Book [{str(self)}]>'

Now let's open 2 Django consoles and execute the following commands:

Console 1:
Python 3.10.0 (default, Oct  4 2021, 22:09:55) [GCC 9.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from django.db import transaction

In [2]: from books.models import Author, Book

In [3]: transaction.set_autocommit(False)  # Start transaction 1

In [4]: author = Author.objects.get(id=1)

In [5]: author
Out[5]: <Author [Stephen King]>

In [6]: Book.objects.create(title='Carrie', author=author)  # Call 1st
Out[6]: <Book ["Carrie"]>

In [7]: author.country = 'USA'

In [8]: author.save()  # Call 3rd
Console 2:
Python 3.10.0 (default, Oct  4 2021, 22:09:55) [GCC 9.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from django.db import transaction

In [2]: from books.models import Author, Book

In [3]: transaction.set_autocommit(False)  # Start transaction 2

In [4]: author = Author.objects.get(id=1)

In [5]: author
Out[5]: <Author [Stephen King]>

In [6]: Book.objects.create(title="Salem's Lot", author=author)  # Call 2nd
Out[6]: <Book ["Salem's Lot"]>

In [7]: author.birth_date = '1947-09-21'

In [8]: author.save()  # Call 4th
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
# Long traceback skipped ...
OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

In [9]: 

Note the sequence of calls in the comments. Why has it happened? The thing is that when saving a child model with a foreign key relationship to the parent MySQL/MariaDB puts an implicit shared lock on the parent database row, that is, the parent Django model instance. "Shared" means that several transactions can hold it and read the locked row. It is done to guarantee that the foreign key constraint will be valid, that is, the parent row will exist when the transaction is committed. When you save the parent model itself MySQL/MariaDB tries to acquire another lock on the corresponding database row — this time an exclusive one that prevents other transactions from modifying or locking the parent row. But in our case transaction 2 already holds a shared lock on the parent so transaction 1 has to wait until transaction 2 releases the shared lock. And when transaction 2 in its turn saves the parent model it also tries to acquire an exclusive lock on the parent row that is already locked by the shared lock in transaction 1. The result is a deadlock, that is, the situation when two transaction wait for each other to release a lock on the same database row. The database backend detects such situation, aborts transaction 2 and raises an error that translates into a Python OperationalError exception.

So the answer to our question is this: when you save multiple Django models that have parent-child relationships via foreign keys parent models must be saved first to prevent the situation when several transactions acquire shared locks on the same database rows. If a parent model is saved first MySQL/MariaDB will put an exclusive lock on the corresponding row so other transactions will have to wait until the first transaction finishes and they won't be able to acquire a shared lock simultaneously with the first transaction.

When your Django app doesn't have many users that work in parallel this consideration does not matter much because the likelihood of 2 users modifying the same items is pretty low, but when the number of users and their activity increase the probability of users working with the same data at the same time increases as well. And correct organizing of database operations helps prevent database deadlocks in parallel transactions that results in better user experience.

  DjangoMariaDBMySQLPython