I have a DB of survey data that looks like this:
create table Products(
Id int primary key,
Name nvarchar(max)
)
create table Opinions(
Id int primary key,
ProductId int foreign key references Products(Id),
IsEasyToUse bit default((0)),
IsGoodPrice bit default((0)),
IsDurable bit default((0)),
-- and so on...
)
I want to create a report that shows every product and a percentage of all the opinion columns from the Opinions
table.
So far, I have been doing it the naive way:
select
p.Name,
1.0 *
(select count(*) from Opinions o where o.ProductId = p.Id and o.IsEasyToUse = 1) /
(select count(*) from Opinions o where o.ProductId = p.Id) as EasyToUse,
1.0 *
(select count(*) from Opinions o where o.ProductId = p.Id and o.IsGoodPrice = 1) /
(select count(*) from Opinions o where o.ProductId = p.Id) as GoodPrice,
1.0 *
(select count(*) from Opinions o where o.ProductId = p.Id and o.IsDurable = 1) /
(select count(*) from Opinions o where o.ProductId = p.Id) as Durable
from
(select Id, Name from Products) p
This technically works, but I am not happy with the repetition of the conditions. Also the Opinions table is growing in size so the performance issues are starting to become noticeable (it now takes around a minute with 1m rows).
Is there a way to rewrite it in a way that's faster (and easier to modify in the future in case I want to apply more filters)?
As a starter: we can join and aggregate instead of executing multiples correlated subqueries.
Then: we don't need to compute the total count; instead, we can just juse avg()
; less calls to aggregate functions should improve the efficiency of the query:
select p.id, p.name
avg(case when IsEasyToUse = 1 then 1.0 else 0 end) as EasyToUse,
avg(case when IsGoodPrice = 1 then 1.0 else 0 end) as GoodPrice,
avg(case when IsDurable = 1 then 1.0 else 0 end) as Durable
from products p
inner join opinions o on o.productID = p.id
group by p.id, p.name
Depending on how your data is spread across the two tables, apply
might offer a good alternative, since it avoids outer aggregation:
select p.id, p.name, x.*
from products p
cross apply (
select
avg(case when IsEasyToUse = 1 then 1.0 else 0 end) as EasyToUse,
avg(case when IsGoodPrice = 1 then 1.0 else 0 end) as GoodPrice,
avg(case when IsDurable = 1 then 1.0 else 0 end) as Durable
from opinions o
where o.productID = p.id
) x