Search code examples
sqlsqliterelational-division

Borrowers that take all loans using NOT EXISTS


I want to find the borrowers who took all loan types.

Schema:

loan (number (PKEY), type, min_rating)
borrower (cust (PKEY), no (PKEY))

Sample tables:

number | type     | min_rating
------------------------------
L1     | student  | 500
L2     | car      | 550
L3     | house    | 500
L4     | car      | 700
L5     | car      | 900

cust  | no 
-----------
Jim   | L2
Tom   | L1
Tom   | L2
Tom   | L3
Tom   | L4
Tom   | L5
Bob   | L3

The answer here would be "Tom".

I can simply count the total number of loans and compare the borrower's number of loans to that, but I'm NOT allowed to (this is a homework exercise), for the purposes of this homework and learning.

I wanted to use double-negation where I first find the borrowers who didn't take all the loans and find borrowers who are not in that set. I want to use nesting with NOT EXISTS where I first find the borrowers that didn't take all the loans but I haven't been able to create a working query for that.


Solution

  • A simple approach is to use the facts:

    • that an outer join gives you nulls when there's no join
    • coalesce() can turn a null into a blank (that will always be less that a real value)

    Thus, the minimum coalesced loan number of a person who doesn't have every loan type will be blank:

    select cust
    from borrower b
    left join loan l on l.number = b.no
    group by cust
    having min(coalesce(l.number, '')) > ''
    

    The group-by neatly sidesteps the problem of selecting people more than once (and the ugly subqueries that often requires), and relies on the quite reasonable assumption that a loan number is never blank. Even if that were possible, you could still find a way to make this pattern work (eg coalesce the min_rating to a negative number, etc).

    The above query can be re-written, possibly more readably, to use a NOT IN expression:

    select distinct cust
    from borrower
    where cust not in (
      select cust
      from borrower b
      left join loan l on l.number = b.no
      where l.number is null
    )
    

    By using the fact that a missed join returns all nulls, the where clause of the inner query keeps only missed joins.

    You need to use DISTINCT to stop borrowers appearing twice.


    Your schema has a problem - there is a many-to-many relationship between borrower and load, but your schema handles this poorly. borrower should have one row for each person, and another association table to record the fact that a borrower took out a loan:

    create table borrower (
        id int,
        name varchar(20)
        -- other columns about the person
    );
    
    create table borrrower_loan (
        borrower_id int, -- FK to borrower
        load_number char(2) -- FK to loan
    );
    

    This would mean you wouldn't need the distinct operator (left to you to figure out why), but also handles real life situations like two borrowers having the same name.