Search code examples
pythondjangodjango-querysetdjango-piston

Limit queryset results based on model field, are there better ways?


basically i want to get the latest 30 log entries for every Host. Currently i'm doing this in django-piston.

def read(self,request):
    val={}
    for x in Host.objects.all():
        val[x.uuid_id]=DataLog.objects.filter(host=x).order_by('-time')[:30]           
    return val

Unfortunately this request takes way to long (currently 1s for about 10k database entries). Are there more efficient ways to do this?

Harper


Solution

  • If you are using PostgreSQL as a database backend and don't need cross database compatibility, you can use the powerful window functions that allow something like that :

    Imagine your table looks like that :

    CREATE TABLE x (
        i serial primary key,
        value integer not null,
        date timestamp,
        category text);
    

    And you want the most recent value for each category. You will do :

    SELECT
        first_value(i) over w,
        first_value(value) over w,
        first_value(date) over w
        category,
    FROM x
    WINDOW w AS (PARTITION BY category ORDER BY date DESC);
    

    You can use such a query in django via the raw method on the query set manager :

    ModelX.objects.raw("""SELECT DISTINCT ....... FROM x WINDOW w .....""")
    

    To get the last N entries by category, the query is a little more complicated and involves a subquery :

    SELECT i, value, date, category
    FROM (SELECT
            i, value, date, category,
            row_number() over w
        FROM x
        WINDOW w AS (PARTITION BY category ORDER BY date DESC)) AS subquery
    WHERE subquery.row_number <= 30;
    

    Seeing that, you could even make a view :

    CREATE VIEW x_with_reverse_date_index AS
        (SELECT
            i, value, date, category,
            row_number() over w
        FROM x
        WINDOW w AS (PARTITION BY category ORDER BY date DESC));
    

    And create a django model that queries this view :

    class ModelX(models.Model):
        ...
        ...
        row_number = models.IntegerField("Row number when ordering by date desc")
    
        class Meta:
            db_table = 'x_with_reverse_date_index'
    

    and query it "normally" :

    ModelX.objects.filter(category__in = ('catA','catB'), row_number__lte = 30)
    ModelX.objects.filter(row_number = 29)
    ...
    

    Warning : again, don't do this if you need a code that works on another database engine.