basically i want to get the latest 30 log entries for every Host. Currently i'm doing this in django-piston.
def read(self,request):
val={}
for x in Host.objects.all():
val[x.uuid_id]=DataLog.objects.filter(host=x).order_by('-time')[:30]
return val
Unfortunately this request takes way to long (currently 1s for about 10k database entries). Are there more efficient ways to do this?
Harper
If you are using PostgreSQL as a database backend and don't need cross database compatibility, you can use the powerful window functions that allow something like that :
Imagine your table looks like that :
CREATE TABLE x (
i serial primary key,
value integer not null,
date timestamp,
category text);
And you want the most recent value for each category. You will do :
SELECT
first_value(i) over w,
first_value(value) over w,
first_value(date) over w
category,
FROM x
WINDOW w AS (PARTITION BY category ORDER BY date DESC);
You can use such a query in django via the raw
method on the query set manager :
ModelX.objects.raw("""SELECT DISTINCT ....... FROM x WINDOW w .....""")
To get the last N entries by category, the query is a little more complicated and involves a subquery :
SELECT i, value, date, category
FROM (SELECT
i, value, date, category,
row_number() over w
FROM x
WINDOW w AS (PARTITION BY category ORDER BY date DESC)) AS subquery
WHERE subquery.row_number <= 30;
Seeing that, you could even make a view :
CREATE VIEW x_with_reverse_date_index AS
(SELECT
i, value, date, category,
row_number() over w
FROM x
WINDOW w AS (PARTITION BY category ORDER BY date DESC));
And create a django model that queries this view :
class ModelX(models.Model):
...
...
row_number = models.IntegerField("Row number when ordering by date desc")
class Meta:
db_table = 'x_with_reverse_date_index'
and query it "normally" :
ModelX.objects.filter(category__in = ('catA','catB'), row_number__lte = 30)
ModelX.objects.filter(row_number = 29)
...
Warning : again, don't do this if you need a code that works on another database engine.