Search code examples
djangopostgresqldjango-modelsdjango-querysetdjango-aggregation

Calculate average exchange rate for time period


In Django I have model similar to this example:

class Currency(models.Model):
    name = models.CharField(max_length=3, unique=True)
    full_name = models.CharField(max_length=20)


class ExchangeRate(models.Model):
    currency = models.ForeignKey('Currency')
    start_date = models.DateFiled()
    end_date = models.DateField()
    exchange_rate = models.DecimalField(max_digits=12, decimal_places=4)

Let's simplify this and assume that we have only one currency and ExchangeRate table looks like:

+---------------------+-------------------+------------+------------+---------------+
| currency_from__name | currency_to__name | start_date |  end_date  | exchange_rate |
+---------------------+-------------------+------------+------------+---------------+
|        PLN          |        USD        | 2014-03-01 | 2014-08-01 |    3.00000    |
|        PLN          |        USD        | 2014-08-01 | 2014-12-01 |    6.00000    |
+---------------------+-------------------+------------+------------+---------------+

Note that this is example to simplify math operations!

In this table data density is once per month and valid record for one month is for example when start_date = 2014.03.01 and end_date = 2014.04.01, so start_date is inclusive and end_date is exclusive.

I want to calculate average exchange rate for time period:

[2014.06.01 ; 2012.09.01)

Which means: >= 2014.06.01 and < 2014.09.01

When in Django I write:

start_date = date(2014, 6, 1)
end_date = date(2014, 9, 1)

ExchangeRate.objects.all().filter(
        (
            Q(start_date__lt=start_date) & 
            Q(end_date__gt=start_date)
        ) | (
            Q(start_date__gte=start_date) & 
            Q(start_date__lt=end_date) & 
            Q(end_date__gt=start_date) 
        )
).annotate(
    currency_from_name = 'currency_from__name', 
    currency_to_name = 'currency_to__name'
).values(  # GROUP BY
    'currency_from_name',
    'currency_to_name'
).aggregate(
    F('currency_from_name'), 
    F('currency_to_name'), 
    Avg('exchange_rate')
)

After this query I'm receiving value 4.5000 which from mathematical reason is correct but wrong when You need to take care of time range.
Correct answer is 4.000.

I only came up with this solution to annotate extra column with this formula and then calculate average value from this column:

https://www.codecogs.com/eqnedit.php?latex=\inline&space;Abs&space;\left&space;(&space;\frac{months&space;\left&space;(&space;greater(ER_{start_date}\&space;,\&space;start_date),&space;smaller(ER_{start_date}\&space;,\&space;end_date)&space;\right&space;)&space;}{months(start_date\&space;,\&space;end_date)}&space;\right&space;)&space;*&space;ER_{exchange_rate}

Where:

  • Abs is function for absolute value abs()
  • months is function to calculate months between two dates months_between()
  • greater, smaller are functions to choose accordingly greater and smaller value from arguments - greatest(), least()
  • ER means column from ExchangeRate - e.g. F('exchange_rate')

I'm using 9.3 PostgreSQL DB and Django 1.8.4.

Maybe there is a simple function for that?
Maybe I'm overcomplicating this?


Solution

  • 1. months_between():

    create function months_of(interval)
     returns int strict immutable language sql as $$
      select extract(years from $1)::int * 12 + extract(month from $1)::int
    $$;
    
    create function months_between(date, date)
     returns int strict immutable language sql as $$
       select months_of(age($1, $2))
    $$;
    

    2. average_weight():

    create function average_weight(numeric, date, date, date, date)
     returns numeric(9,2) strict immutable language sql as $$
       select abs(months_between(GREATEST($2, $4), LEAST($3, $5))/months_between($4, $5))*$1
    $$;
    

    3. AverageWeight:

    from django.db.models.aggregates import Func
    from django.db.models.fields import FloatField
    
    class AverageWeight(Func):
        function = 'average_weight'
    
        def __init__(self, *expressions):
            super(AverageWeight, self).__init__(*expressions, output_field=FloatField())
    

    In your view:

    ExchangeRate.objects.all().filter(
            (
                Q(start_date__lt=start_date) & 
                Q(end_date__gt=start_date)
            ) | (
                Q(start_date__gte=start_date) & 
                Q(start_date__lt=end_date) & 
                Q(end_date__gt=start_date) 
            )
    ).annotate(
        currency_from_name = 'currency_from__name', 
        currency_to_name = 'currency_to__name',
        weight_exchange = AverageWeight(
            F('exchange_rate'),
            start_date,
            end_date,
            F('start_date'),
            F('end_date'),
        )
    ).values(  # GROUP BY
        'currency_from_name',
        'currency_to_name'
    ).aggregate(
        F('currency_from_name'), 
        F('currency_to_name'), 
        Avg('weight_exchange')
    )