Search code examples
djangodjango-modelsforeign-keysrelational-databasemanytomanyfield

Spanning multi-valued relationships


I read the django documentation and I found something confusing! https://docs.djangoproject.com/en/3.2/topics/db/queries/#spanning-multi-valued-relationships.

It is stated that "When you are filtering an object based on a ManyToManyField or a reverse ForeignKey, there are two different sorts of filter you may be interested in"

"To select all blogs that contain entries with both “Lennon” in the headline and that were published in 2008 (the same entry satisfying both conditions), we would write:"

Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)

"To select all blogs that contain an entry with “Lennon” in the headline as well as an entry that was published in 2008, we would write:"

Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

"Suppose there is only one blog that had both entries containing “Lennon” and entries from 2008, but that none of the entries from 2008 contained “Lennon”. The first query would not return any blogs, but the second query would return that one blog"

So, by reading this guide, multiple questions have been created in my mind:

1- What is difference between these two types of filtering? what I know is that both of them must return blogs that contain both 'Lennon' in headline and was published in 2008. I can't find out the point.

2- The Document says this rule is just for ManyToManyField or a reverse ForeignKey. The question is, Why?


Solution

  • What is the difference between these two types of filtering?

    Imagine that you have a Blog with two (or more) entries. One of the entries has as headline some text that contains the word 'Lennon' but is not written in 2008, furthermore the blog has an entry without the word 'Lennon' in the headline, but written in the year 2008.

    The first query with .filter(entry__headline__contains='Lennon', entry__pub_date__year=2008) looks for an entry that contains 'Lennon' in the headline and that same entry should be written in 2008. For our sample data defined in the first paragraph, this would mean that this Blog is not selected, since the two conditions apply to the same entry.

    If on the other hand you write this as .filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008), Django will make two LEFT OUTER JOINs, this thus means it looks for a blog where one if the entries has a headline containing Lennon, and an entry (this can be a different or the same entry) has been written in 2008. This will thus retrieve the Blog defined in the first paragraph.

    The Document says this rule is just for ManyToManyField or a reverse ForeignKey. The question is, Why?

    Because a reverse ForeignKey is a one-to-many relation, and a ManyToManyField spans a many-to-many relation. The …-to-many is important here. If that would be a …-to-one, it is simply impossible to make a Blog with two or more entries: if a Blog has a ForeignKey to an Entry model, then that means that the blog can only have one entry, not multiple ones. Since this thus limits the number of entries to one, it is impossible to have two or more entries and thus the filtering will not differentiate between the two cases.