This is a tough one. I have a django-based web-app where users can post interesting URLs, and have spectators comment under each posting. Moreover, I use stealth banning against misbehaving users.
Interesting URLs are represented by Link
model; comments under the posting are reprenseted by Publicreply
model.
Here's how these models are related to one other:
class Link(models.Model):
submitter = models.ForeignKey(User)
submitted_on = models.DateTimeField(auto_now_add=True)
url = models.URLField(max_length=250)
class Publicreply(models.Model):
submitted_by = models.ForeignKey(User)
answer_to = models.ForeignKey(Link)
submitted_on = models.DateTimeField(auto_now_add=True)
description = models.TextField(validators=[MaxLengthValidator(250)])
Users
I ban go here:
class HellBanList(models.Model):
condemned = models.ForeignKey(User)
when = models.DateTimeField(auto_now_add=True)
What's my question? I'm trying to write the best performing, optimized query that gets me:
1. The publicreply
with the absolutely latest timestamp FROM among all distinct links
THAT either the user herself posted, OR below which the said user left a publicreply
2. But if the publicreply
object gotten in 1 was of someone who's part of HellBanList
(i.e. condemned attribute), I'd want to ignore that particular publicreply
and dig up the next most-recent one. If that's of a banned user too, I'll skip to the next (and so on).
3. Finally, I only need submitted_by
and submitted_on
attribute values from the publicreply
object I end up with after catering for both 1 and 2.
So what have I tried thus far? This:
freshest_link = Link.objects.filter(Q(submitter=self.request.user)|Q(publicreply__submitted_by=self.request.user)).distinct().annotate(date=Max('publicreply__submitted_on')).latest('date')
freshest_reply = Publicreply.objects.filter(answer_to=freshest_link).latest('submitted_by')
What's the problem with this code? The fact that it doesn't cater to requirement 2. I think I ought to have somehow run an exclude()
on publicreplys
posted by users
in HellBanList
before calculating freshest_reply
(in order to meet requirement 2). Anyhow I can't figure out how to do that. Secondly, an expert can perhaps build the whole query more efficiently than how I've attempted it, goal being minimum possible DB roundtrips.
Can you help out in this? If you feel I've been opaque in my description of what I'm trying to achieve, do ask for clarification.
Note: my DB is postgres
Get the banned users list like:
banlist = HellBanList.objects.values_list('condemned',flat=True)
modify freshest_reply query like:
freshest_reply = Publicreply.objects.filter(answer_to=freshest_link).exclude(submitted_by_id__in=banlist).latest('submitted_on')
2nd method:
In your query, first you get the Link
which has the latest public reply and in the next query you get the latest Publicreply
for that Link
. Instead of that we can just get the latest reply in one query like:
latest_reply = Publicreply.objects.filter(Q(answer_to__submitter=self.request.user)|Q(submitted_by=self.request.user)).exclude(submitted_by_id__in=banlist).latest('submitted_on')
Let me know if I missed something.