Search code examples
djangodjango-annotatedjango-aggregationdjango-subquery

Django Annotation Count with Subquery & OuterRef


I'm trying to create a high score statistic table/list for a quiz, where the table/list is supposed to be showing the percentage of (or total) correct guesses on a person which was to be guessed on. To elaborate further, these are the models which are used.

The Quiz model:

class Quiz(models.Model):
    participants = models.ManyToManyField(
        User,
        through="Participant",
        through_fields=("quiz", "correct_user"),
        blank=True,
        related_name="related_quiz",
    )
    fake_users = models.ManyToManyField(User, related_name="quiz_fakes")
    user_quizzed = models.ForeignKey(
        User, related_name="user_taking_quiz", on_delete=models.CASCADE, null=True
    )
    time_started = models.DateTimeField(default=timezone.now)
    time_end = models.DateTimeField(blank=True, null=True)
    final_score = models.IntegerField(blank=True, default=0)

This model does also have some properties; I deem them to be unrelated to the problem at hand.

The Participant model:

class Participant(models.Model):  # QuizAnswer FK -> QUIZ
    guessed_user = models.ForeignKey(
        User, on_delete=models.CASCADE, related_name="clicked_in_quiz", null=True
    )
    correct_user = models.ForeignKey(
        User, on_delete=models.CASCADE, related_name="solution_in_quiz", null=True
    )
    quiz = models.ForeignKey(
        Quiz, on_delete=models.CASCADE, related_name="participants_in_quiz"
    )

    @property
    def correct(self):
        return self.guessed_user == self.correct_user

To iterate through what I am trying to do, I'll try to explain how I'm thinking this should work:

  1. For a User in User.objects.all(), find all participant objects where the user.id equals correct_user(from participant model)
  2. For each participantobject, evaluate if correct_user==guessed_user
  3. Sum each participant object where the above comparison is True for the User, represented by a field sum_of_correct_guesses
  4. Return a queryset including all users with parameters [User, sum_of_correct_guesses]

^Now ideally this should be percentage_of_correct_guesses, but that is an afterthought which should be easy enough to change by doing sum_of_correct_guesses / sum n times of that person being a guess.

Now I've even made some pseudocode for a single person to illustrate to myself roughly how it should work using python arithmetics

# PYTHON PSEUDO QUERY ---------------------
person = get_object_or_404(User, pk=3)  # Example-person
y = Participant.objects.filter(
    correct_user=person
)  # Find participant-objects where person is used as guess
y_corr = []  # empty list to act as "queryset" in for-loop

for el in y:  # for each participant object
    if el.correct:  # if correct_user == guessed_user
        y_corr.append(el)  # add to queryset
y_percentage_corr = len(y_corr) / len(y)  # do arithmetic division
print("Percentage correct: ", y_percentage_corr)  # debug-display
# ---------------------------------------------

What I've tried (with no success so far), is to use an ExtensionWrapper with Count() and Q object:

percentage_correct_guesses = ExpressionWrapper(
Count("pk", filter=Q(clicked_in_quiz=F("id")), distinct=True)
/ Count("solution_in_quiz"),
output_field=fields.DecimalField())

all_users = (
User.objects.all().annotate(score=percentage_correct_guesses).order_by("score"))

Any help or directions to resources on how to do this is greatly appreciated :))


Solution

  • I found an answer while looking around for related problems: Django 1.11 Annotating a Subquery Aggregate

    What I've done is:

    • Create a filter with an OuterRef() which points to a User and checks if Useris the same as correct_person and also a comparison between guessed_person and correct_person, outputs a value correct_user in a queryset for all elements which the filter accepts.
    • Do an annotated count for how many occurrences there are of a correct_user in the filtered queryset.
    • Annotate User based on the annotated-count, this is the annotation that really drives the whole operation. Notice how OuterRef() and Subquery are used to tell the filter which user is supposed to be correct_user.

    Below is the code snippet which I made it work with, it looks very similar to the answer-post in the above linked question:

    from django.db.models import Count, OuterRef, Subquery, F, Q
    
    crit1 = Q(correct_user=OuterRef('pk'))
    crit2 = Q(correct_user=F('guessed_user'))
    compare_participants = Participant.objects.filter(crit1 & crit2).order_by().values('correct_user')
    count_occurrences = compare_participants.annotate(c=Count('*')).values('c')
    most_correctly_guessed_on = (
        User.objects.annotate(correct_clicks=Subquery(count_occurrences))
        .values('first_name', 'correct_clicks')
        .order_by('-correct_clicks')
    )
    return most_correctly_guessed_on
    

    This works wonderfully, thanks to Oli.