Summary:
ordered_tags = tagged_items.annotate(freq=Count('tag_id')).order_by('-freq')
for i in ordered_tags:
print(i.tag_id, i.tag, i.freq)
This does not work equivalently to GROUP BY tag_id ORDER BY COUNT(tag_id) DESC
. Why?
Using Django ORM, I am trying to do something like:
SELECT tag_id, COUNT(tag_id) AS freq
FROM taggit_taggeditem
WHERE content_type_id IN (
SELECT id FROM django_content_type
WHERE app_label = 'reviews'
AND model IN ('problem', 'review', 'comment')
) AND (
object_id = 1
OR object_id IN (
SELECT id FROM review
WHERE problem_id = 1
) OR object_id IN (
SELECT c.id FROM comment AS c
INNER JOIN review AS r
ON r.id = c.review_id
)
)
GROUP BY tag_id
ORDER BY freq DESC;
So here's what I have contrived:
querydict_for_content_type_id = {
'current_app_label_query' : Q(app_label=ReviewsConfig.name),
'model_name_query' : Q(model__in=['problem', 'review', 'comment'])
}
# used in content_type_query
query_for_content_type_id = reduce(operator.__and__, querydict_for_content_type_id.values())
# Query relevant content_type from TaggedItem model.
content_type_query = Q(content_type_id__in=ContentType.objects.filter(query_for_content_type_id))
# Query relevant object_id from TaggedItem model.
object_query = Q(object_id=pk) | Q(object_id__in=problem.review_set.all())
for review in problem.review_set.all():
object_query |= Q(object_id__in=review.comment_set.all())
tagged_items = TaggedItem.objects.filter(content_type_query&object_query)
# JOIN Tag
# tags = tagged_items.select_related('tag')
# GROUP BY freq ORDER_BY freq DESC;
ordered_tags = tagged_items.annotate(freq=Count('tag_id'))#.order_by('-freq')
for i in ordered_tags:
print(i.tag_id, i.tag, i.freq)
Django ORM doesn't work as intended. Calling distinct
method on ordered_tags doen't work either.
Output:
18 tag1 1
19 tag2 1
20 tag3 1
...
18 tag1 1
Why am I getting 18 tag1 1
twice instead of 18 tag1 2
?
.annotate()
is not at all equivalent to GROUP BY
.
Generally using Django's ORM to translate SQL queries isn't a good idea. It is meant to help you solve a problem by translating your idea's into queries.
It would not make any sens to get TaggedItem instances as results of an SQL queries containing a GROUP BY
statement since those would not represent a row of the TaggedItem table.
To get a GROUP BY
statement, you need to get ride of the model layer by forcing the results to be dict.
qs = TaggedItem.objects.all()
qs = qs.values("tag_id") # Any field of the TaggedItem table you want as entries in the result dict
qs = qs.annotate(freq=Count("tag_id")) # This, combined with the above line, will produce a group by tag_id
qs = qs.order_by("-freq")