Is it possible to return querysets that return only one object per foreign key?
For instance, I want to get the latest comments from django_comments, but I only want one comment (the latest) per object; that is, return only the latest comment on an object and exclude all earlier comments on that object. I guess this would be similar to a SQL GROUP BY on django_comments.content_type and django_comments.object_pk.
The end goal is to create a list of active comment "threads" displayed/ordered by which thread has the most recent comment, just like your standard discussion board whose topics are listed by recent activity.
I figure the best way to do this would be grabbing the latest comments, and then sorting or grouping them by content type and object_pk so that only one comment (the latest) is returned per related content object. I can then use that comment to get all the info I need, so the word "thread" is used loosely, since I'm really just grabbing a comment and following its keys.
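To make the idea concrete, here is a minimal plain-Python sketch of that filtering step. The Comment tuple and latest_per_object are made-up names standing in for real comment objects, not anything from django_comments, and this version sorts everything in memory rather than in the database.

```python
from collections import namedtuple

# Hypothetical stand-in for a comment row; only the fields used here.
Comment = namedtuple("Comment", "content_type object_pk user submit_date")


def latest_per_object(comments):
    """Return the newest comment for each (content_type, object_pk) pair,
    with the most recently active object first."""
    seen = set()
    latest = []
    # Walk newest-first, so the first comment seen for a key is its latest.
    for comment in sorted(comments, key=lambda c: c.submit_date, reverse=True):
        key = (comment.content_type, comment.object_pk)
        if key not in seen:
            seen.add(key)
            latest.append(comment)
    return latest


comments = [
    Comment("entry", "1", "alice", "2010-01-02"),
    Comment("entry", "1", "bob", "2010-01-05"),
    Comment("entry", "2", "carol", "2010-01-08"),
    Comment("entry", "2", "dave", "2010-01-09"),
]
threads = latest_per_object(comments)
```

That works for small sites, but it pulls every comment out of the database first, which is why I'd rather have the query do it.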
The MODEL is django_threadedcomments which extends django_comments with some added fields for trees, children, and parents.
...this returns all comments including all instances of parent
comments = ThreadedComment.objects.all().exclude(is_public='0').order_by("-submit_date")
...and this is ideal
comments = ThreadedComment.objects.all().exclude(is_public='0').order_by("-submit_date").[plus logic to drop all but the newest comment for each object_pk and content_type]
{% for comment in comments %}
TITLE: {{comment.content_object.title}}
STARTED BY : {{comment.content_object.user}}
MOST RECENT REPLY : {{comment.user}} on {{comment.submit_date}}
{% endfor %}
Thanks again!
This is a fairly difficult thing to do in SQL at all; you probably won't be able to do it through the ORM.
You can't use GROUP BY for this. GROUP BY tells SQL how to group rows for aggregation, which isn't what you're doing here. "SELECT x, y FROM table GROUP BY x" is invalid in standard SQL, because each group of x can contain many different values of y, so a single y per group is meaningless.
Let's look at this with a clear schema in mind:
CREATE TABLE objects ( id INTEGER PRIMARY KEY, name VARCHAR NOT NULL );
CREATE TABLE comments ( object_id INTEGER REFERENCES objects (id), text VARCHAR NOT NULL, date TIMESTAMP NOT NULL );
INSERT INTO objects (id, name) VALUES (1, 'object 1'), (2, 'object 2');
INSERT INTO comments (object_id, text, date) VALUES
(1, 'object 1 comment 1', '2010-01-02'),
(1, 'object 1 comment 2', '2010-01-05'),
(2, 'object 2 comment 1', '2010-01-08'),
(2, 'object 2 comment 2', '2010-01-09');
SELECT * FROM objects o JOIN comments c ON (o.id = c.object_id);
The most elegant way I've seen for doing this is PostgreSQL 8.4's window functions.
SELECT * FROM (
SELECT
o.*, c.*,
rank() OVER (PARTITION BY object_id ORDER BY date DESC) AS r
FROM objects o JOIN comments c ON (o.id = c.object_id)
) AS s
WHERE r = 1;
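That'll select the newest comment for each object: rank() numbers each object's comments by date, newest first, and the outer WHERE keeps only rank 1. If you don't see what this is doing, execute the inner SELECT on its own and watch how it generates rank(), which makes it pretty straightforward. Note that rank() gives tied rows the same rank, so two comments on one object with identical dates would both come back; row_number() avoids that.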
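If you want to watch the ranking happen without a PostgreSQL server handy, here's a rough sketch that runs the same query through Python's sqlite3 module. It assumes your SQLite is 3.25 or newer (that's when it gained window functions), and the explicit column lists and ORDER BY clauses are my additions to keep the output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, name VARCHAR NOT NULL);
    CREATE TABLE comments (
        object_id INTEGER REFERENCES objects (id),
        text VARCHAR NOT NULL,
        date TIMESTAMP NOT NULL
    );
    INSERT INTO objects (id, name) VALUES (1, 'object 1'), (2, 'object 2');
    INSERT INTO comments (object_id, text, date) VALUES
        (1, 'object 1 comment 1', '2010-01-02'),
        (1, 'object 1 comment 2', '2010-01-05'),
        (2, 'object 2 comment 1', '2010-01-08'),
        (2, 'object 2 comment 2', '2010-01-09');
""")

# The inner SELECT: every comment, ranked within its object, newest first.
inner = """
    SELECT o.name, c.text, c.date,
           rank() OVER (PARTITION BY c.object_id ORDER BY c.date DESC) AS r
    FROM objects o JOIN comments c ON (o.id = c.object_id)
"""
ranked = conn.execute(inner + " ORDER BY o.id, r").fetchall()

# The full query: keep only rank 1, most recently active object first.
latest = conn.execute(
    "SELECT name, text, date FROM (" + inner + ") AS s "
    "WHERE r = 1 ORDER BY date DESC"
).fetchall()

for row in ranked:
    print(row)
print(latest)
```

The ranked rows show r restarting at 1 for each object, which is all the outer WHERE relies on.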
I know other ways of doing this with PostgreSQL, but I don't know how to do this in other databases.
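For what it's worth, one pattern that sticks to plain SQL is an anti-join: keep a comment only if no newer comment exists for the same object. This is a sketch of that idea, not one of the PostgreSQL-specific approaches mentioned above; it's run against SQLite here only so it's self-contained, and the "newer" alias is just a name I picked. Like rank(), it returns both rows when two comments on an object share the same date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, name VARCHAR NOT NULL);
    CREATE TABLE comments (
        object_id INTEGER REFERENCES objects (id),
        text VARCHAR NOT NULL,
        date TIMESTAMP NOT NULL
    );
    INSERT INTO objects (id, name) VALUES (1, 'object 1'), (2, 'object 2');
    INSERT INTO comments (object_id, text, date) VALUES
        (1, 'object 1 comment 1', '2010-01-02'),
        (1, 'object 1 comment 2', '2010-01-05'),
        (2, 'object 2 comment 1', '2010-01-08'),
        (2, 'object 2 comment 2', '2010-01-09');
""")

# A comment survives only when no later comment exists on the same object.
latest = conn.execute("""
    SELECT o.name, c.text, c.date
    FROM objects o JOIN comments c ON (o.id = c.object_id)
    WHERE NOT EXISTS (
        SELECT 1 FROM comments newer
        WHERE newer.object_id = c.object_id AND newer.date > c.date
    )
    ORDER BY c.date DESC
""").fetchall()
print(latest)
```

Whether this performs acceptably depends heavily on the database and on having an index covering (object_id, date), so treat it as a starting point.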
Trying to compute this dynamically is likely to give you serious headaches, and it takes more work to make these complex queries perform well, too. Chances are you're better off doing this the simple way: store a last_comment_id field for each object and update it when a comment is added or deleted, so you can just join and sort. You could probably use SQL triggers to handle this updating automatically.
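Here's a rough sketch of the last_comment_id idea with triggers, again using Python's sqlite3 module so it runs anywhere. The trigger syntax is SQLite's (PostgreSQL would need trigger functions instead), and the comments.id column and the trigger names comment_added and comment_deleted are assumptions I added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (
        id INTEGER PRIMARY KEY,
        name VARCHAR NOT NULL,
        last_comment_id INTEGER  -- denormalized pointer to the newest comment
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        object_id INTEGER REFERENCES objects (id),
        text VARCHAR NOT NULL,
        date TIMESTAMP NOT NULL
    );

    -- Recompute the pointer whenever a comment is added or removed.
    CREATE TRIGGER comment_added AFTER INSERT ON comments
    BEGIN
        UPDATE objects SET last_comment_id = (
            SELECT id FROM comments WHERE object_id = NEW.object_id
            ORDER BY date DESC, id DESC LIMIT 1
        ) WHERE id = NEW.object_id;
    END;
    CREATE TRIGGER comment_deleted AFTER DELETE ON comments
    BEGIN
        UPDATE objects SET last_comment_id = (
            SELECT id FROM comments WHERE object_id = OLD.object_id
            ORDER BY date DESC, id DESC LIMIT 1
        ) WHERE id = OLD.object_id;
    END;

    INSERT INTO objects (id, name) VALUES (1, 'object 1'), (2, 'object 2');
    INSERT INTO comments (object_id, text, date) VALUES
        (1, 'object 1 comment 1', '2010-01-02'),
        (1, 'object 1 comment 2', '2010-01-05'),
        (2, 'object 2 comment 1', '2010-01-08'),
        (2, 'object 2 comment 2', '2010-01-09');
""")

# With the pointer maintained, listing threads is a plain join and sort.
query = """
    SELECT o.name, c.text, c.date
    FROM objects o JOIN comments c ON (c.id = o.last_comment_id)
    ORDER BY c.date DESC
"""
before = conn.execute(query).fetchall()

# Deleting the newest comment on object 2 moves its pointer back one comment.
conn.execute("DELETE FROM comments WHERE text = 'object 2 comment 2'")
after = conn.execute(query).fetchall()
print(before)
print(after)
```

The read query stays trivial; the cost moves to writes, and anything that changes a comment's date would need a similar UPDATE trigger.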