Search code examples
sqlsqlitejoingroup-byhaving

Find keys in a sqlite database that meet intersection of conditions


It's hard for me to summarize the query I want to make, so maybe an example will make it clearer.

Let's say I have two primary tables:

employees:

| employee_id | employee_name |
| ----------- | ------------- |
| 1           | Alice         |
| 2           | Bob           |
| 3           | Carol         |

programming_languages:

| language_id | language_name |
| ----------- | ------------- |
| 1           | Python        |
| 2           | C++           |
| 3           | JavaScript    |

And a third table, with foreign keys from the first two, that says which employees know which programming languages:

employee_known_languages:

| employee_id | language_id |
| ----------- | ----------- |
| 1           | 1           |
| 2           | 1           |
| 2           | 2           |
| 2           | 3           |
| 3           | 3           |

So in this example Alice knows Python, Bob knows Python C++ and JavaScript, and Carol knows JavaScript.

If I wanted to write a query to get all the employees who know C++, I know what I'd want to do:

SELECT employee_name from
    employees INNER JOIN programming_languages INNER JOIN employee_known_languages
WHERE language_name = C++

What if I wanted to get all the employees who know C++ and Python? Or, more generally, some arbitrary subset of the values of language_name?

Ideally I'd be able to write this in a way where I can write a Python function that accepts a list of arguments and does the query, but I think just being able to figure out the SQL itself would help.


Solution

  • Use a CTE that returns an arbitrary subset of the values of language_name and filter the results of the joins of the 3 tables for these language_names only.
    Then use aggregation and set the condition in the HAVING clause:

    WITH cte(language_name) AS (VALUES ('C++'), ('Python'))
    SELECT e.* 
    FROM employees e
    INNER JOIN employee_known_languages ep ON ep.employee_id = e.employee_id
    INNER JOIN programming_languages p ON p.language_id = ep.language_id
    WHERE p.language_name IN cte
    GROUP BY e.employee_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM cte);
    

    See the demo.