Search code examples
pythondjangodjango-modelsmptt

Duplicate records when treeforeignkey is null in django mptt


I have this model:

class Genre(MPTTModel):
    id = models.CharField(max_length=100)
    name = models.CharField(max_length=100)
    parent = TreeForeignKey(
        'self', 
        null=True, 
        blank=True, 
        related_name='subgenre'
    )

    def __str__(self):
        return self.name

    class Meta:
        unique_together = (('id', 'parent'),)

I didn't want to have duplicate records, so I'm using unique_together with the id and the TreeForeignKey.

Even with unique_together, I'm still able to add duplicates when I set the parent to null. How can I avoid that?


Solution

  • This is an SQL design decision.

    SQL 2011 draft, page 474 reads:

    If there are no two rows in T such that the value of each column in one row is non-null and is not distinct from the value of the corresponding column in the other row, then the result of the is True; otherwise, the result of the is False.

    This means that two NULL values are considered distinct when it comes to the unique constraint. This contradicts the NULL data type definition in page 41:

    Two null values are not distinct.

    A null value and a non-null value are distinct.

    Two non-null values are distinct if the General Rules of Subclause 8.15, “”, return True.

    General Rules of Subclause 8.15 says:

    If both V1 and V2 are the null value, then the result is False.

    To summarize:

    When it comes to the data type, the "distinctiveness" of the two nulls is False meaning NULL == NULL.

    But the unique constraint at the table level says otherwise: NULL != NULL. There can be many NULLs in a field of a table that says they should be unique.

    The Django ticket tracking this is #1751 unique_together does not work when any of the listed fields contains a FK. The workaround is to define your own .validate_unique model method as mentioned in the documentation.

    from django.core.exceptions import ValidationError
    from django.db import transaction
    
    def validate_unique(self, exclude=None):
        with transaction.atomic():
            if Genre.objects.select_for_update().filter(parent=self.parent, id=self.id).exists():
                params = {
                    'model_name': _('Genre'),
                    'field_labels': _('Parent and ID')
                }
                raise ValidationError(
                    message=_(
                        '%(model_name)s with this %(field_labels)s already exists.'
                    ), code='unique_together', params=params,
                )
    

    The select_for_update creates a lock to avoid a race condition.

    This solution works for form submits, it doesn't work when accessing the Genre.objects.create() method directly. In that situation you need to create the Genre instance in three steps:

    genre = Genre(id='id1')
    genre.validate_unique()
    genre.save()