In a Ruby on Rails project, 4.0, I have a Payments
model that has_many
Transactions
model.
If I have the following data
SELECT * FROM payments;
id | paid_amount
---+--------------
1 | 200
2 | 300
3 | 100
4 | 400
5 | 100
6 | 600
7 | 100
8 | 100
9 | 800
SELECT * FROM transactions;
id | payment_id | type
---+-------------+------
1 | 2 | cash
2 | 3 | credit
3 | 1 | credit
4 | 4 | cash
5 | 1 | cash
6 | 6 | credit
7 | 1 | cash
8 | 1 | credit
9 | 8 | cash
Right now, I'm calculating the sum of an numeric attribute paid_amount
from Payments
, that has a transaction of certain type, as follows
> Payment.where(id: Transaction.where(type: 'cash').pluck(:payment_id)).sum(:paid_amount)
> 1000
> # The sum of paid_amount from payments with ids 1, 2, 4 and 8
But this doesn't work fast enough with many thousands of records so I tried to accomplish this with includes
without any luck.
> Payment.includes(:transactions).where("transactions.type = 'cash'").sum(:paid_amount)
> 1200
> # paid_amount of payment with id 1 is considered two times because of transactions with id 5 and 7
Any thoughts on how should I calculate the number I need?
Your first query:
Payment.where(id: Transaction.where(type: 'cash').pluck(:payment_id)).sum(:paid_amount)
is slow partly because Transaction.where(type: 'cash').pluck(:payment_id)
is a query that returns an array and then you send that array back to the database with an IN (big_list_of_integers)
in the SQL. A slight modification will use a subquery instead, just switch the pluck
to select
:
Payment.where(id: Transaction.where(type: 'cash').select(:payment_id)).sum(:paid_amount)
that will result in one query like:
select sum(payments.paid_amount)
from payments
where payments.id in (
select payment_id
from transactions
where transactions.type = ...
)
being sent to the database. This should be quite a bit faster with large lists.