Backstory (can skip): I have a database with records of events. Events for each label occurred around a certain period in the year. I would like to find when in the year, more or less, that group of events occurred. Therefore I planned to calculate the average timestamp per group, and do this efficiently with postgres, instead of fetching all the timestamps and calculating locally.
The question: I'm trying to average a timestamp without time zone Postgres field with Django's excellent ORM, like so:
from django.db.models import Avg
ModelName.objects.filter(a_field='some value').aggregate(Avg('time'))
However I'm getting:
function avg(timestamp without time zone) does not exist
LINE 1: SELECT AVG("model_name"."time") AS "time__avg" FROM "m...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Is there a way to do this with django's ORM?
If not, how can I work around it?
I had a similar problem: I wanted to find the average time at which votes were cast for a particular item, but Postgres doesn't allow taking the average of datetimes. Trying to do so results in the following error:
django.db.utils.ProgrammingError: function avg(timestamp with time zone) does not exist
LINE 1: SELECT "votes_item"."name", AVG("votes_vote"."datetime") AS ...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
To make this simpler, consider the following tables, where Vote has a foreign key to Item:
The Item table:
id (PK) | name (char) | datetime (time at which the item was inserted) |
---|---|---|
1 | Apple | 22-06-23 11:25:33 |
2 | Orange | 22-06-22 01:22:18 |
The Vote table:
id (PK) | user (FK, the user who voted) | item (FK, the item voted on) | vote (1 = positive, -1 = negative) | datetime (time at which the vote was cast) |
---|---|---|---|---|
1 | 1 | 1 | 1 | 2022-06-22 11:26:18 |
2 | 3 | 1 | 1 | 2022-06-21 12:26:36 |
3 | 2 | 1 | 1 | 2022-06-26 01:20:59 |
I wanted to know the average time at which users voted for each item, for example the average time of all votes for Apple (i.e. an annotate per item).
Since Postgres's avg function doesn't accept a datetime directly, first convert the datetime to seconds since the epoch, then take the average, and finally convert the result back to a datetime.
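To see what the epoch round trip does, here is a minimal plain-Python sketch of the same three steps applied to the three Apple votes from the Vote table. It assumes the stored times are UTC, and `average_datetime` is just an illustrative helper name, not part of Django or Postgres.

```python
from datetime import datetime, timezone

# Vote timestamps for item 1 (Apple) from the Vote table.
# Assumption: the stored values are in UTC.
votes = [
    datetime(2022, 6, 22, 11, 26, 18, tzinfo=timezone.utc),
    datetime(2022, 6, 21, 12, 26, 36, tzinfo=timezone.utc),
    datetime(2022, 6, 26, 1, 20, 59, tzinfo=timezone.utc),
]


def average_datetime(values):
    """Average aware datetimes by going through epoch seconds."""
    # Step 1: datetime -> seconds since the epoch (like EXTRACT(epoch FROM ...)).
    epochs = [value.timestamp() for value in values]
    # Step 2: plain numeric average (like AVG(...)).
    mean_epoch = sum(epochs) / len(epochs)
    # Step 3: seconds -> datetime (like to_timestamp(...)).
    return datetime.fromtimestamp(mean_epoch, tz=timezone.utc)


avg_time = average_datetime(votes)
print(avg_time)
```

The result lands on 23 June 2022 at about 08:24:37 UTC. Doing these steps inside the database avoids fetching every timestamp just to average it locally.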
To make things simpler, create two classes as shown below.
from django.db import models
from django.db.models import Avg


class Epoch(models.expressions.Func):
    # Seconds since the Unix epoch, as a float that AVG() accepts.
    template = 'EXTRACT(epoch FROM %(expressions)s)::FLOAT'
    output_field = models.FloatField()


class DateTimeFromFloat(models.expressions.Func):
    # Converts the averaged epoch seconds back into a timestamp.
    template = 'To_TIMESTAMP(%(expressions)s)::TIMESTAMP at time zone \'UTC\''
    output_field = models.DateTimeField()
Read more about Func in this excellent answer.
Now I wanted to get the average time at which each item received a positive vote, so I used:
Item.objects.filter(vote__vote=1).annotate(avg_time=DateTimeFromFloat(Avg(Epoch('vote__datetime')))).values('avg_time', 'name')
The important part:
annotate(avg_time=DateTimeFromFloat(Avg(Epoch('vote__datetime'))))
Output:
<QuerySet [{'name': 'Apple', 'avg_time': datetime.datetime(2022, 6, 23, 8, 24, 37, 666667, tzinfo=datetime.timezone.utc)}]>
You can perform a similar operation using aggregate, by passing the same expression to aggregate() instead of annotate().