SELECT
business_period,
SUM(transaction.transaction_value) AS total_transaction_value,
SUM(transaction.loss_value) AS total_loss_value,
(total_transaction_value - total_loss_value) AS net_value
FROM transaction
GROUP BY business_period
The above does not work as total_transaction_value
and total_loss_value
are not from the transaction
table. Is there a way to make this query work?
Note: this query involves 500 million rows, so need to efficient.
Question:
Some answers have suggested that SUM(transaction.transaction_value) - SUM(transaction.loss_value)
is cached and won't need to be computed again where as others are suggesting that I should as a derived table / subsequery to avoid repeated computation. Could someone point to something that could settle the difference in opinion?
I am using postgres 9.3.
ANSWER:
I want to quote erwin's comment here:
I ran a quick test with 40k rows and the winner was the plain version without subquery. CTE was slowest. So I think my first assumption was wrong and the query planner understands not to calculate the sums repeatedly (makes sense, too). I have seen different results with more complex expressions in the past. The planner does get smarter with every new version
Use:
SELECT
business_period,
SUM(transaction.transaction_value) AS total_transaction_value,
SUM(transaction.loss_value) AS total_loss_value,
(SUM(transaction.transaction_value) - SUM(transaction.loss_value)) AS net_value
FROM transaction
GROUP BY business_period