My Model is sort of like
class ServiceUtilization(models.Model):
device_name = models.CharField()
service_name = models.CharField()
data_source = models.CharField()
current_value = models.CharField()
sys_timestamp = models.IntegerField()
Now, here current_value
represents the value in Float stored as VarChar, w.r.t the time stored as unixtime
While trying to get Max and Average values of current_value
I am getting unexpected results, because for Max, MySQL would do a string based comparision, where in '100' value < '9.99'
which in incorrect w.r.t values taken in Float.
I tried :
perf = ServiceUtilization.objects.filter(
device_name__in=devices,
service_name__in=services,
data_source__in=data_sources,
sys_timestamp__gte=start_date,
sys_timestamp__lte=end_date
).values(
'device_name',
'service_name',
'data_source'
).annotate(
max_val=Max('current_value'),
avg_val=Avg('current_value')
)
It provides the incorrect results.
Then looking at : HOW select min from cast varchar to int in mysql
I thought about providing query set with extra
perf = ServiceUtilization.objects.extra(
select={
'max_val': "MAX(CAST(current_value AS SIGNED))",
'avg_val': "AVG(CAST(current_value AS SIGNED))"
}
).filter(
device_name__in=devices,
service_name__in=services,
data_source__in=data_sources,
sys_timestamp__gte=start_date,
sys_timestamp__lte=end_date
).values(
'device_name',
'service_name',
'data_source',
'max_val',
'avg_val'
)
But this just provides a single value and not desired results. This translates to SQL as
SELECT (MAX(CAST(current_value AS SIGNED))) AS `max_val`, (AVG(CAST(current_value AS SIGNED))) AS `avg_val`, `performance_utilizationstatus`.`device_name`, `performance_utilizationstatus`.`service_name`, `performance_utilizationstatus`.`data_source`
FROM performance_utilizationstatus
ORDER BY performance_utilizationstatus
.sys_timestamp
DESC;
But the working code would require a GROUP BY on (device_name, service_name, data_source)
SELECT (MAX(CAST(current_value AS SIGNED))) AS `max_val`, (AVG(CAST(current_value AS SIGNED))) AS `avg_val`, `performance_utilizationstatus`.`device_name`, `performance_utilizationstatus`.`service_name`, `performance_utilizationstatus`.`data_source` FROM `performance_utilizationstatus`
GROUP BY performance_utilizationstatus
.device_name
, performance_utilizationstatus
.service_name
,
performance_utilizationstatus
.data_source
ORDER BY performance_utilizationstatus
.sys_timestamp
DESC;
How to add the GROUP BY CLAUSE ?
Using annotate
would not work here
1111, 'Invalid use of group function'
or
ERROR 1056 (42000): Can't group on 'max_val'
Would RAW SQL be the last resort here ?
I think you must go with .raw
because using .extra
wont be possible here.
The problem is because Django
don't have .group_by
the only way go group by something is to use .values
and .annotate
after that. (as you have done it in the first attempt)
so.. why you can't use .extra
? Because:
Any extra() call made after a values() call will have its extra selected fields ignored.
and
If you use a values() clause after an extra() call, any fields defined by a select argument in the extra() must be explicitly included in the values() call.
so the only way to get the .extra
fields is to add them into .values
but this will cause to group by this fields which is an undesired behavior.