I'm trying to return all rows which are associated with a user who is associated with ALL of the queried 'tags'. My table structure and desired output is below:
admin.tags:
user_id | tag | detail | date
2 | apple | blah... | 2015/07/14
3 | apple | blah. | 2015/07/17
1 | grape | blah.. | 2015/07/23
2 | pear | blahblah | 2015/07/23
2 | apple | blah, blah | 2015/07/25
2 | grape | blahhhhh | 2015/07/28
system.users:
id | email
1 | joe@test.com
2 | jane@test.com
3 | bob@test.com
queried tags:
'apple', 'pear'
desired output:
user_id | tag | detail | date | email
2 | apple | blah... | 2015/07/14 | jane@test.com
2 | pear | blahblah | 2015/07/23 | jane@test.com
2 | apple | blah, blah | 2015/07/25 | jane@test.com
Since user_id 2 is associated with both 'apple' and 'pear' each of her 'apple' and 'pear' rows are returned, joined to system.users
in order to also return her email.
I'm confused on how to properly set up this postgresql query. I have made several attempts with left anti-joins, but cannot seem to get the desired result.
The query in the derived table gets you the user ids for users that have all specified tags and the outer query gets you the details.
select *
from "system.users" s
join "admin.tags" a on s.id = a.user_id
join (
select user_id
from "admin.tags"
where tag in ('apple', 'pear')
group by user_id
having count(distinct tag) = 2
) t on s.id = t.user_id;
Note that this query would include users who have both tags that you search for but may have other too as long as they at least have the two specified.
With your sample data the output would be:
| id | email | user_id | tag | detail | date | user_id |
|----|---------------|---------|-------|------------|------------------------|---------|
| 2 | jane@test.com | 2 | grape | blahhhhh | July, 28 2015 00:00:00 | 2 |
| 2 | jane@test.com | 2 | apple | blah, blah | July, 25 2015 00:00:00 | 2 |
| 2 | jane@test.com | 2 | pear | blahblah | July, 23 2015 00:00:00 | 2 |
| 2 | jane@test.com | 2 | apple | blah... | July, 14 2015 00:00:00 | 2 |
If you want to exclude the row with grape
just add a where tag in ('apple', 'pear')
to the outer query too.
If you want only users that have only the searched for tags and none other (eg. exact division) you can change the query in the derived table to:
select user_id
from "admin.tags"
group by user_id
having sum(case when tag = 'apple' then 1 else 0 end) >= 1
and sum(case when tag = 'pear' then 1 else 0 end) >= 1
and sum(case when tag not in ('apple','pear') then 1 else 0 end) = 0
This would not return anything given your sample data as user 2 also has grape