Search code examples
jsondjangoorm

How to aggregate (min/max etc.) over Django JSONField data?


I'm using Django 1.9 with its built-in JSONField and Postgres 9.4. In my model's attrs json field I store objects with some values, including numbers. And I need to aggregate over them to find min/max values. Something like this:

Model.objects.aggregate(min=Min('attrs__my_key'))

Also, it would be useful to extract specific keys:

Model.objects.values_list('attrs__my_key', flat=True)

The above queries fail with

FieldError: "Cannot resolve keyword 'my_key' into field. Join on 'attrs' not permitted."

Is it possible somehow?

Notes:

  1. I know how to make a plain Postgres query to do the job, but am searching specifically for an ORM solution to have the ability to filter etc.
  2. I suppose this can be done with a (relatively) new query expressions/lookups API, but I haven't studied it yet.

Solution

  • For those who interested, I've found the solution (or workaround at least).

    from django.db.models.expressions import RawSQL
    
    Model.objects.annotate(
        val=RawSQL("((attrs->>%s)::numeric)", (json_field_key,))
    ).aggregate(min=Min('val')
    

    Note that attrs->>%s expression will become smth like attrs->>'width' after processing (I mean single quotes). So if you hardcode this name you should remember to insert them or you will get error.

    /// A little bit offtopic ///

    And one more tricky issue not related to django itself but that is needed to be handled somehow. As attrs is json field and there're no restrictions on its keys and values you can (depending on you application logic) get some non-numeric values in, for example, width key. In this case you will get DataError from postgres as a result of executing the above query. NULL values will be ignored meanwhile so it's ok. If you can just catch the error then no problem, you're lucky. In my case I needed to ignore wrong values and the only way here is to write custom postgres function that will supress casting errors.

    create or replace function safe_cast_to_numeric(text) returns numeric as $$
    begin
        return cast($1 as numeric);
    exception
        when invalid_text_representation then
            return null;
    end;
    $$ language plpgsql immutable;
    

    And then use it to cast text to numbers:

    Model.objects.annotate(
        val=RawSQL("safe_cast_to_numeric(attrs->>%s)", (json_field_key,))
    ).aggregate(min=Min('val')
    

    Thus we get quite solid solution for such a dynamic thing as json.