Search code examples
mysqlsqlrelational-division

SQL Query: How to improve?


I have two tables in a MySQL database:

Book(title, publisher, year) title is primary key

Author(name, title) title is foreign key to Book

I am trying to select the name of the Authors that published a book each year from 2000 to 2005 inclusive. This SQL query works, but is there a way to do this that makes it easier to change the date range if needed?

SELECT DISTINCT name
FROM Author
WHERE name IN  (SELECT Author.name 
                FROM Author INNER JOIN Book ON (Author.title = Book.title) 
                WHERE year = 2000)
        AND name IN
                (SELECT Author.name 
                FROM Author INNER JOIN Book ON (Author.title = Book.title)
                WHERE year = 2001)
        AND name IN
                (SELECT Author.name 
                FROM Author INNER JOIN Book ON (Author.title = Book.title)
                WHERE year = 2002)
        AND name IN
                (SELECT Author.name 
                FROM Author INNER JOIN Book ON (Author.title = Book.title)
                WHERE year = 2003)
        AND name IN
                (SELECT Author.name 
                FROM Author INNER JOIN Book ON (Author.title = Book.title)
                WHERE year = 2004)
        AND name IN
                (SELECT Author.name 
                FROM Author INNER JOIN Book ON (Author.title = Book.title)
                WHERE year = 2005);

Solution

  • Here's two ways to do it, and how another approach is wrong due to a subtle fault.

    SQL Fiddle

    MySQL 5.5.32 Schema Setup:

    create table Book (title varchar(10), year int) ;
    create table Author (name varchar(10), title varchar(10));
    
    insert Book values 
    ('Book1',2000),('Book2',2000),
    ('Book3',2000),('Book4',2000),
    ('Book5',2000),('Book6',2000),
    ('Book7',2001),('Book8',2002),
    ('Book9',2003),('Book10',2004),
    ('Book11',2005);
    
    insert into Author values 
    ('Author1','Book1'),('Author1','Book2'),
    ('Author1','Book3'),('Author1','Book4'),
    ('Author1','Book5'),('Author1','Book6'),
    ('Author2','Book6'),('Author2','Book7'),
    ('Author2','Book8'),('Author2','Book9'),
    ('Author2','Book10'),('Author2','Book11');
    
    # author1 has written 6 books in one year
    # author2 has written 1 book in every of the six years
    

    Query 1:

    # incorrect as it matches author1 who has 6 books in a single year
    SELECT name from Author 
    INNER JOIN BOOK on Author.title = Book.Title 
    WHERE year IN (2000,2001,2002,2003,2004,2005) 
    GROUP BY name 
    HAVING COUNT(name) = 6
    

    Results:

    |    NAME |
    |---------|
    | Author1 |
    | Author2 |
    

    Query 2:

    # correct as it counts distinct years
    SELECT name from Author 
    INNER JOIN BOOK on Author.title = Book.Title 
    WHERE year IN (2000,2001,2002,2003,2004,2005) 
    GROUP BY name 
    HAVING COUNT(DISTINCT year) = 6
    

    Results:

    |    NAME |
    |---------|
    | Author2 |
    

    Query 3:

    # correct using relational division
    SELECT DISTINCT name
    FROM Author A1
    INNER JOIN Book B1 ON A1.title = B1.Title
    WHERE NOT EXISTS (
        SELECT * 
        FROM Book B2
        WHERE year IN (2000,2001,2002,2003,2004,2005) 
        AND NOT EXISTS (
           SELECT *
           FROM Author A2
           INNER JOIN Book B3 ON A2.title = B3.Title
           WHERE (A1.name = A2.name)
            AND (B3.year = B2.year)
        )
    )
    

    Results:

    |    NAME |
    |---------|
    | Author2 |