PostgreSQL 9.3.2, compiled by Visual C++ build 1600, 64-bit
Each customer can have many orders and referrals. Now, I want to create a view with some statistics for customer where, for each customer, I have some calculated columns (one row for each customer).
Create the view:
create view myview
select
a.customer_id,
sum(a.num) as num_orders,
sum(b.num) as num_referrals
from
(
select
customer.id as customer_id,
count(customer.id) as num
from
customer
left join
order
on
order.customer_id = customer.id
group by
customer.id
) a
left join
(
select
customer.id as customer_id,
count(customer.id) as num
from
customer
left join
referral
on
referral.customer_id = customer.id
group by
customer.id
) b
on
a.customer_id = b.customer_id
group by
a.customer_id,
b.customer_id
;
Query A (this is fast):
select
customer.*,
myview.*
from
customer
left join
myview
on
customer.id = myview.customer_id
where
customer.id = 100
;
Query B (this is SLOW):
select
customer.*,
myview.*
from
customer
left join
myview
on
customer.id = myview.customer_id
where
customer.sex = 'M'
;
Query C (this is fast):
select
customer.*,
myview.*
from
customer
left join
myview
on
customer.id = myview.customer_id
where
(select id from customer where sex = 'M')
;
OK, so why is Query B so much different than Query A in terms of performance? I guess, in Query B, it is running those subqueries first without filtering, but I don't know how to fix it.
The problem is that it is our ORM that is generating the query. So, I can't fix the problem by doing something like Query C.
I'm hoping there's just a better way to design my view to fix the problem. The main difference in the EXPLAIN
results between Query A and Query B is that Query B has some MERGE RIGHT JOIN
operations.
Any ideas?
EDIT:
I added following information per requests from people commenting. The following is the more true-to-life info (as opposed to the simplified, hypothetical scenario above).
create or replace view myview as
select
a.id_worder,
count(a.*) as num_finance_allocations,
count(b.*) as num_task_allocations
from
(
select
woi.id_worder,
count(*) as num
from
worder_invoice woi
left join
worder_finance_task ct
on
ct.id_worder_finance = woi.id
left join
worder_finance_task_allocation cta
on
cta.id_worder_finance_task = ct.id
group by
woi.id_worder
) a
left join
(
select
wot.id_worder,
count(*) as num
from
worder_task wot
left join
worder_task_allocation wota
on
wota.id_worder_task = wot.id
group by
wot.id_worder
) b
on
a.id_worder = b.id_worder
group by
a.id_worder,
b.id_worder
;
Query A (fast, apparently I need a rep of more than 10 to post more than 2 links, so no EXPLAIN for this one)
select
*
from
worder a
left outer join
myview b
on
a.id = b.id_worder
where
a.id = 100
;
Query B (SLOW, EXPLAIN)
select
*
from
worder a
left outer join
myview b
on
a.id = b.id_worder
where
a.id_customer = 200
Query C (fast, EXPLAIN)
select
*
from
worder a
left outer join
myview b
on
a.id = b.id_worder
where
a.id = (select id from worder where id_customer = 200)
;
Try rewriting your view like so:
create view myview
select
c.customer_id,
(
select count(*) from order o where o.customer_id=c.customer_id
) num_orders,
(
select count(*) from referral r where r.customer_id=c.customer_id
)
from customer c ;