I have 2 tables that can have many-to-many relations between them:
Person (pid, name, a,b) ,
Attributes (attribId, d,e)
The mapping is present in a separate table:
Mapping (mapId, pid, attribId)
The goal is to get all Person and Attributes values for a person who qualifies the filter criteria. The filter criteria is based on a column in the Attributes table. Eg - column d.
For example:
Person ->
(1,'person1','a1','b1')
(2,'person2','a1','b1')
Attributes ->
(1,'d1','e1')
(2,'d2','e1')
(3,'d3','e1')
(4,'d3','e2')
Mapping ->
(1,1,1)
(2,1,2)
(3,1,3)
After running the query ->
Result:
(1,'person1','a1','b1')(1,'d1','e1')
(1,'person1','a1','b1')(2,'d2','e1')
(1,'person1','a1','b1')(3,'d3','e1')
Query that i have been trying ->
select p.*, a.*
from
Person p
left outer join
Mapping m
on p.pid=m.pid
left outer join
Attributes a
on m.attribId=a.attribId
where
p.pid in (select p1.pid
from
Person p1
left outer join
Mapping m1
on p1.pid=m1.pid
left outer join
Attributes a1
on m1.attribId=a1.attribId
where
a1.d = 'd1')
Similarly, I also have to discard Person entries that have a certain d value.
So, currently, the final query looks like this:
SELECT
p.*,
a.*
FROM Person p
LEFT OUTER JOIN Mapping m
ON p.pid = m.pid
LEFT OUTER JOIN Attributes a
ON m.attribId = a.attribId
WHERE p.pid IN (SELECT
p1.pid
FROM Person p1
LEFT OUTER JOIN Mapping m1
ON p1.pid = m1.pid
LEFT OUTER JOIN Attributes a1
ON m1.attribId = a1.attribId
WHERE a1.d = 'd1')
AND p.pid NOT IN (SELECT
p2.pid
FROM Person p2
LEFT OUTER JOIN Mapping m2
ON p2.pid = m2.pid
LEFT OUTER JOIN Attributes a2
ON m2.attribId = a2.attribId
WHERE a2.d = 'd5');
It feels like this query is inefficient since the same join is done at 3 places. Is there a way to reuse the join for all the sub-queries and make this more efficient?
You can get all persons satisfying the filter using:
select m.pid
from mapping m join
attributes a
on m.attribId = a.attribId and a.d = 'dS';
You can get all person/attribute combinations using IN
or EXISTS
or a JOIN
. Which is better depends on the database. But the idea is:
select p.*, a.*
from person p join
mapping m
on p.pid = m.pid join
attributes a
on m.attribId = a.attribId
where p.pid in (select m.pid
from mapping m join
attributes a
on m.attribId = a.attribId and a.d = 'dS'
);
I see no reason to have left join
s for these queries.
EDIT:
If the filter criteria is based on multiple columns, then use group by
and having
for the subquery:
select m.pid
from mapping m join
attributes a
on m.attribId = a.attribId and a.d = 'dS'
group by m.pid
having sum(case when a.d = 'dS' then 1 else 0 end) > 0 and -- at least one of these
sum(case when a.d = 'd1' then 1 else 0 end) = 0; -- none of these