Search code examples
mysqljoinindexinggroup-byhaving

Which index(es) should be used in these queries?


I have a question about SQL indexing in my theory class. It asks me to choose which column should be indexed to optimize these 2 queries: (This is 'paper' question so I am not provided the database to test these indexes with EXPLAIN)

1, First query.

SELECT BranchName, No_Of_Copies 
FROM BOOK as B, BOOK_COPIES as BC, LIBRARY_BRANCH as LB
WHERE B.BookId = BC.BookId and BC.BranchId=LB.BranchId and title ="The Lost Tribe";

I have the answer of this one, which is that BOOK.title, BOOK_COPIES.BranchId and LIBRARY_BRANCH.BranchId should be used for indexing. However, I don't really understand why BOOK.BookId and BOOK_COPIES.BookId are not chosen for indexing.

2, Second query

SELECT B.cardNo, Name, Address, COUNT(BookId,BranchId) 
FROM BORROWER as B, BOOK_LOANS as BL
WHERE (BL.CardNo=B.CardNo)
GROUP BY B.CardNo, Name, Address
HAVING COUNT(BL.BookId, BranchId)>5;

Would it be optimized if I create index on BOOK_LOANS.CardNo, BORROWER.CardNo, Name and Address ?


Solution

  • That class needs to be updated. Using commas in JOIN is antiquated; the new style uses JOIN .. ON

    The question is ambiguous -- what table contains title? I'll assume it is B.

    1. Since the only filtering is on title, The Optimizer will pick B as the first table to look at:

       B needs INDEX(title)
      
    2. From B, it can reach for BC:

       BC needs INDEX(BookId)
      
    3. Similarly:

       LB needs INDEX(BranchId)
      

    If you are using MySQL, be aware that a PRIMARY KEY is an index. And every table needs a PRIMARY KEY. Also a PRIMARY KEY is necessarily unique. So, when I said "needs", you may find that there is already a PRIMARY KEY satisfying the need.

    More: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

    I will quibble with the schema -- why is the "number of books" not simply a column in Books?

    As for query 2, it is even less clear which table each column might be in.

    Do be aware that an INDEX can contain only columns from a single table.