Search code examples
sqliterelational-division

Return distinct pairs of names which have the same exact items in column


I want to find the distinct pairs of names in the table which have the same exact items in the items column. For instance:

CREATE TABLE t
(
    name    VARCHAR(255),
    item    VARCHAR(255)
);

INSERT INTO t VALUES("Alice", "Orange");
INSERT INTO t VALUES("Alice", "Pear");
INSERT INTO t VALUES("Alice", "Lemon");
INSERT INTO t VALUES("Bob", "Orange");
INSERT INTO t VALUES("Bob", "Pear");
INSERT INTO t VALUES("Bob", "Lemon");
INSERT INTO t VALUES("Charlie", "Pear");
INSERT INTO t VALUES("Charlie", "Lemon");

The answer here would be Alice,Bob because they took the exact same items.

I want to do it with double negation (using NOT EXISTS/NOT IN) only which I think is more well-suited to this question, but I couldn't come up with anything that is remotely close to being functional.

This is somewhat similar to this question but I'm using SQLite so I cannot use GROUP_CONCAT() but I was wondering how it would be done using relational division using NOT EXISTS/NOT IN.


Solution

  • With compound queries:

    SELECT t1.name, t2.name
    FROM t AS t1, t AS t2
    GROUP BY t1.name, t2.name
    HAVING t1.name < t2.name
       AND NOT EXISTS (SELECT item FROM t WHERE name = t1.name
                       EXCEPT
                       SELECT item FROM t WHERE name = t2.name)
       AND NOT EXISTS (SELECT item FROM t WHERE name = t2.name
                       EXCEPT
                       SELECT item FROM t WHERE name = t1.name);
    

    Using NOT IN is possible, bit expresses exactly the same mechanism with more complexity:

    SELECT t1.name, t2.name
    FROM t AS t1, t AS t2
    GROUP BY t1.name, t2.name
    HAVING t1.name < t2.name
       AND NOT EXISTS (SELECT item
                       FROM t
                       WHERE name = t1.name
                         AND item NOT IN (SELECT item
                                          FROM t
                                          WHERE name = t2.name))
       AND NOT EXISTS (SELECT item
                       FROM t
                       WHERE name = t2.name
                         AND item NOT IN (SELECT item
                                          FROM t
                                          WHERE name = t1.name));