Search code examples
pythondjangodjango-annotate

Django annotate() is not adding up the number of entries but is instead repeating them


Background:

The Amazon Kindle PaperWhite stores the words we lookup while reading into a sqlite3 database called vocab.db. I am working on a small kindle companion app that takes this db file and imports it into a django table for various processing. I have done this step already.

What I would like to do:

I would like to query my table KindleLookups for my most difficult words (ex: how many times have I looked up a specific word). I would ultimately like to present this data in a table ordered by highest count.

Desired Result:

Word Lookup Count
Reverberated 3
Troubadour 1
Corrugated 1

My result (undesired):

Here Reverberated is being repeated three times each with a lookup count of one, instead of one time with three lookup count.

Word Lookup Count
Reverberated 1
Reverberated 1
Reverberated 1
Troubadour 1
Corrugated 1

Model:

class KindleLookups(TimeStampedModel):
    book = models.ForeignKey(KindleBookInfo, on_delete=models.CASCADE)
    word = models.ForeignKey(KindleWords, on_delete=models.CASCADE)
    ...

class KindleWords(TimeStampedModel):
    word_key = models.CharField(max_length=255, unique=True)
    word = models.CharField(max_length=255)
    ...

I am trying to accomplish this using annotate(), but this is repeating the rows instead of adding them up for some reason.

context['lookup_counts'] = KindleLookups.objects.annotate(word_count=Count("word"))

I then thought that I needed to annotate on the actual word, but nothing seems to have changed.

context['lookup_counts'] = KindleLookups.objects.annotate(word_count=Count("word__word"))

Template:

<tbody>
{% for word in lookup_counts %}
<tr>
    <td>{{ word.word }}</td>
    <td>{{ word.word_count }}</td>
</tr>
{% endfor %}
</tbody>

So I am hoping you can answer my questions:

Questions

  1. What is actually happening with my annotate()? Why is it repeating the rows instead of counting them, despite me using Count()?
  2. Would counting on the "word" and "word__word" be the exact same thing?
  3. Could this issue be somehow related with my KindleLookups's __str__ method returning the self.usage (the sentence the word was found in) and not the word?

Solution

  • You should annotate the KindleWords model instead, so:

    context['lookup_counts'] = KindleWords.objects.annotate(
        word_count=Count('kindlelookups')
    )

    The KindleWordss that arise from this queryset will have an extra attribute .word_count that determines how many times that KindleWords is used in the KindleLookups.


    Note: normally a Django model is given a singular name, so KindleWord instead of KindleWords.