Search code examples
sqlpostgresqlrelational-division

Postgres exclusive tag search


I'm trying to return all rows which are associated with a user who is associated with ALL of the queried 'tags'. My table structure and desired output is below:

admin.tags:
user_id |   tag   |   detail   |    date
   2    |  apple  | blah...    | 2015/07/14
   3    |  apple  | blah.      | 2015/07/17
   1    |  grape  | blah..     | 2015/07/23
   2    |  pear   | blahblah   | 2015/07/23
   2    |  apple  | blah, blah | 2015/07/25
   2    |  grape  | blahhhhh   | 2015/07/28 

system.users:
id  |    email
 1  | joe@test.com
 2  | jane@test.com
 3  | bob@test.com

queried tags:
'apple', 'pear'

desired output:
user_id |   tag   |   detail   |    date    |  email
   2    |  apple  | blah...    | 2015/07/14 | jane@test.com
   2    |  pear   | blahblah   | 2015/07/23 | jane@test.com
   2    |  apple  | blah, blah | 2015/07/25 | jane@test.com

Since user_id 2 is associated with both 'apple' and 'pear' each of her 'apple' and 'pear' rows are returned, joined to system.users in order to also return her email.

I'm confused on how to properly set up this postgresql query. I have made several attempts with left anti-joins, but cannot seem to get the desired result.


Solution

  • The query in the derived table gets you the user ids for users that have all specified tags and the outer query gets you the details.

    select * 
    from "system.users" s
    join "admin.tags" a on s.id = a.user_id
    join (
        select user_id 
        from "admin.tags" 
        where tag in ('apple', 'pear')
        group by user_id 
        having count(distinct tag) = 2
    ) t on s.id = t.user_id;
    

    Note that this query would include users who have both tags that you search for but may have other too as long as they at least have the two specified.

    With your sample data the output would be:

    | id |         email | user_id |   tag |     detail |                   date | user_id |
    |----|---------------|---------|-------|------------|------------------------|---------|
    |  2 | jane@test.com |       2 | grape |   blahhhhh | July, 28 2015 00:00:00 |       2 |
    |  2 | jane@test.com |       2 | apple | blah, blah | July, 25 2015 00:00:00 |       2 |
    |  2 | jane@test.com |       2 |  pear |   blahblah | July, 23 2015 00:00:00 |       2 |
    |  2 | jane@test.com |       2 | apple |    blah... | July, 14 2015 00:00:00 |       2 |
    

    If you want to exclude the row with grape just add a where tag in ('apple', 'pear') to the outer query too.

    If you want only users that have only the searched for tags and none other (eg. exact division) you can change the query in the derived table to:

    select user_id 
    from "admin.tags" 
    group by user_id
    having sum(case when tag = 'apple' then 1 else 0 end) >= 1
       and sum(case when tag = 'pear' then 1 else 0 end) >= 1 
       and sum(case when tag not in ('apple','pear') then 1 else 0 end) = 0
    

    This would not return anything given your sample data as user 2 also has grape

    Sample SQL Fiddle