Search code examples
sqlpostgresqlgroup-bymany-to-many

Filter rows to return the exact relationship


I have two tables, expenses and categories, they have a many-to-many relationship through the table expenses_categories. I'm trying to implement a filter by categories, lets say that I provided the id for the categories A and B, I want to return the expenses who only have A and B. For example:

  • Expense X have Category A, B, and C
  • Expense Y have Category A and B
  • Expense Z have Category B

I want to return only the Expense Y

I'm using PostgreSQL by the way. I really need to learn how to do this kind of stuff.

Categories

ID NAME
1 TV
2 CC
3 NET

ExpensesCategories

expense_id category_id
1 1
1 2
2 1
2 2
2 3
3 1
4 2

I want to get all the Expenses that have ONLY the Categories 1 and 2.
In that case, I expect to only get the Expense 1

expense_id category_id
1 1
1 2

Solution

  • You can group by expense_id and use STRING_AGG() in the HAVING clause to collect all the category_ids of each expense_id and compare it to a string like '1,2' which contains the category_ids that you want in ascending order as a comma separated list:

    SELECT expense_id
    FROM ExpensesCategories
    GROUP BY expense_id
    HAVING STRING_AGG(category_id::text, ',' ORDER BY category_id) = '1,2';
    

    If you want all the rows of these expense_ids in ExpensesCategories, use the above query as a CTE:

    WITH cte AS (
      SELECT expense_id
      FROM ExpensesCategories
      GROUP BY expense_id
      HAVING STRING_AGG(category_id::text, ',' ORDER BY category_id) = '1,2'
    )
    SELECT * 
    FROM ExpensesCategories
    WHERE expense_id IN (SELECT expense_id FROM cte);
    

    See the demo.