Annotating a Django QuerySet with the count of a Subquery

I'm building a job board. Each Job could have several associated Location objects.

I have designed my Location and Job models as follows:

class Location(BaseModel):
    slug = models.CharField(unique=True)
    city = models.OneToOneField(to="City", null=True)
    state = models.ForeignKey(to="State", null=True)
    country = models.ForeignKey(to="Country")


class Job(BaseModel):
    title = models.CharField()
    description = models.TextField()
    locations = models.ManyToManyField(
        to="Location",
        related_name="jobs",
        through="JobLocation",
    )

So, a Location object has the flexibility to refer to our idea of a country (like United States), a state (like New York) or a city (like Manhattan). I populate the slug field of Location model like so:

If Location object is a country, I use the country name.
- united-states
If Location object is a state, I use the (state name + country name).
- new-york-united-states
If Location object is a city, I use (city name + state name + country name).
- manhattan-new-york-united-states

With slug field populated in this manner, I can simplify querying all the jobs in a particular location using the endswith lookup. For example, if there's a Python job in Manhattan, NY and a React job in Brooklyn, NY, I can get all the jobs in the state of New York like so:

Job.objects.filter(locations__slug__endswith="new-york-united-states").distinct()

Now, I would like to get a list of all my Location objects, and have each Location item annotated with number_of_jobs. This number_of_jobs should be the count of all jobs within that particular Location.

For example:

Manhattan Location object should have number_of_jobs=1 (The Python job)
Brooklyn Location object should have number_of_jobs=1 (The React job)
New York Location object should have number_of_jobs=2 (Python job + React job)

Attempt 1:

The simplest solution, which is to Count the Location.jobs doesn't work. It shows 1 job for Manhattan and 1 job for Brooklyn, but 0 jobs for New York.

Location.objects.annotate(number_of_jobs=Count("jobs", distinct=True))
# 1 job for Manhattan
# 1 job for Brooklyn
# 0 job for New York

Attempt 2:

subquery = Job.objects.filter(locations__slug__endswith=OuterRef("slug")).distinct().count()

Location.objects.annotate(number_of_jobs=Subquery(subquery))

This also doesn't work, because count() executes the queryset immediately.

Attempt 3:

Based on Yuvraj's response, I managed to get absolutely close to the answer, but not quite.

jobs = Job.objects.filter(locations__slug__endswith=OuterRef("slug"))

subquery = jobs.annotate(job_count=Count("id", distinct=True)).values("job_count")[:1]

Location.objects.annotate(number_of_jobs=Subquery(subquery))

If I run these expressions using debugsqlshell of django-debug-toolbar, and I copy the SQL that Django creates, and remove the GROUP BY clause, I get my answer 100% correctly. I don't yet understand why Django adds GROUP BY and why removing it gives the correct answer.

I'm at my wits' end with this queryset. I'd appreciate any help.

Solution

First off, thank you Yuvraj for your response. This answer builds on top of yours.

Turns out, Django automatically adds a GROUP BY clause when you use the Count aggregate function. I don't fully understand why it adds the GROUP BY clause, and why removing it returns the correct results in my case, so I won't speculate. But here's the answer, also based on Sergei's comment on this answer:

class NonAggregateCount(Count):
    # Gets rid of Django's automatic GROUP BY clause
    contains_aggregate = False

jobs = Job.objects.filter(locations__slug__endswith=OuterRef("slug"))

subquery = jobs.annotate(job_count=NonAggregateCount("id", distinct=True)).values("job_count")

Location.objects.annotate(number_of_jobs=Subquery(subquery))