Search code examples
pythonsqlitegroup-bygroup-concathaving

Selecting a match across multiple rows sqlite


Let's say I have a table named food_prefs of friends and their favorite food(s).

id name favorite_foods
1 Amy Pizza
2 Bob Pizza
3 Chad Caviar
4 Dana Pizza
5 Dana Salad

I understand how to get the names of everyone who likes pizza, but I'm unclear on how to select the people who only have pizza listed as their favorite food. That is, from the above table, I only want to select Amy and Bob.

Additionally, it would be great to have a solution that can also select names with multiple favorites (e.g. in another query select everyone who has pizza and salad as their favorite, which would). Finally, it could be useful if the pizza and salad query not only returned people who only liked both foods, but also people who only had one favorite that appears in that list (e.g. people who just like pizza or just like salad — everyone but chad in this example)

(I find the sqlite documentation not the most straightforward, so sorry if this is as a very straightforward question!)


Solution

  • You are right (as I read in your comment below your question) that for your requirement you need aggregation and the conditions in the HAVING clause.

    For example, to get people who only have 'pizza' listed as their favorite food:

    SELECT name 
    FROM food_prefs 
    GROUP BY name
    HAVING GROUP_CONCAT(favorite_foods) = 'Pizza';
    

    But if you want to get people who only have 'pizza' and 'salad' listed as their favorite food, this:

    HAVING GROUP_CONCAT(favorite_foods) = 'Pizza,Salad'
    

    may not work, because SQLite does not support an ORDER BY clause in the function GROUP_CONCAT(), so it is not guaranteed that it will return 'Pizza,Salad' (it could also return 'Salad,Pizza').

    In this case you need a more complex HAVING clause:

    HAVING SUM(favorite_foods IN ('Pizza', 'Salad')) = 2
       AND SUM(favorite_foods NOT IN ('Pizza', 'Salad')) = 0
    

    or:

    HAVING SUM(favorite_foods IN ('Pizza', 'Salad')) = 2
       AND COUNT(*) = 2
    

    I assume that the combination of name and favorite_foods is unique.

    For the case of people who have 'pizza' and/or 'salad' listed as their favorite food, but they also may have other foods listed, you need only a WHERE clause and not aggregation:

    SELECT DISTINCT name    
    FROM food_prefs 
    WHERE favorite_foods IN ('Pizza', 'Salad');